Introduction


This book is the first of three volumes of a full and detailed account of those 
elements of real and complex analysis that mathematical undergraduates 
may expect to meet in the first two years or so of the study of analysis. 
This volume is concerned with the analysis of real-valued functions of a real 
variable. Volume II considers metric and topological spaces, and functions of 
several variables, while Volume III is concerned with complex analysis, and 
with the theory of measure and integration. 

Mathematical analysis depends in a fundamental way on the properties of 
the real numbers, and indeed much of analysis consists of working out their 
consequences. It is therefore essential to develop a full understanding of these 
properties. There are two ways of doing this. The traditional and appropriate 
way is to take the fundamental properties of the real numbers as axioms — the 
real numbers form an ordered field in which every non-empty subset which 
has an upper bound has a least upper bound — and to develop the theory — 
convergence, continuity, differentiation and integration — from these axioms. 
This programme is carried out in Part Two. This theory is meant to be used, 
and Part Two ends with an extensive collection of applications. The reader 
is strongly recommended to follow this tradition, and to begin at 
the beginning of Part Two. 

It is however right to ask about the foundations on which these axioms,
and the rest of mathematical analysis, are built. These foundations are
considered in the Prologue. In the twentieth century, analysis was placed in
a set-theoretic setting, and it is worth understanding what this involves.
Chapter 1 contains an account of Zermelo–Fraenkel set theory, together
with a brief discussion of the axiom of choice and its variants. The
Zermelo–Fraenkel axioms lead naturally to the construction of the natural
numbers. In Chapter 2 it is shown that there is then a steady progression
through the integers and the rational numbers to the real numbers and the
complex numbers. The problem with the natural numbers, the integers and
the rational numbers is that they are very familiar; this part of the journey 
may appear to be spent proving the obvious. The construction of the real 
numbers is a quite different matter. There are many possible constructions, 
but we describe the first, given by Richard Dedekind. This has great virtue, 
since it involves both order and metric properties of the rational numbers and 
of the real numbers. The reader is urged to defer a detailed reading of 
the Prologue until the occasion demands, for example when it becomes 
clear how important the fundamental properties of the real numbers are, or 
when it is important to consider carefully the role of induction, recursion and 
the axiom of dependent choice. 

The text includes plenty of exercises. Some are straightforward, some are 
searching, and some contain results needed later. Many concern applications, 
and all help develop an understanding of the theory: do them! 

I have worked hard to remove errors, but undoubtedly some remain.
Corrections and further comments can be found on a web page on my personal
home page at www.dpmms.cam.ac.uk.


Part One 


Prologue: The foundations of analysis 


1 


The axioms of set theory 


It is probably sensible to read through this chapter fairly quickly, to find out 
the terminology and notation that we shall use, and then to return later to 
read it and think about it more carefully. 


1.1 The need for axiomatic set theory 


Mathematics is written in many languages, such as French, German, Russian, 
Chinese, and, as in the present case, English. Mathematics needs a particular 
precision, and within each of these languages, most of mathematics, and all 
the mathematics that we shall do, is written in the language of sets, using 
statements and arguments that are based on the grammar and logic of the 
predicate calculus. In this chapter we introduce the set theory that we shall 
use. This provides us with a framework in which to work; this framework 
includes a model for the natural numbers (1,2,3,...), together with tools 
to construct all the other number systems (rational, real and complex) and 
functions that are the subject of mathematical analysis. 

The predicate calculus involves rules of grammar for writing ‘well-formed
formulae’, and for providing mathematical arguments which use them.
Well-formed formulae involve variables, and logical operations such as
conjunction (P and Q), disjunction (P or Q (or both)), implication (P implies
Q), negation (not P), and quantifiers ‘there exists’ and ‘for all’, together,
in our case, with sets and the relation ∈. We shall not describe the predicate
calculus, which formalizes the everyday use of these logical operations (for
example, ‘P implies Q’ if and only if ‘(not Q) implies (not P)’), but all our
arguments and constructions will be based on it, and we shall give plenty of
examples of well-formed formulae.¹


¹ For a good account, see A. G. Hamilton, Logic for Mathematicians, Cambridge University Press,
1988.




Since the beginning of the study of set theory by Cantor in the 1870s 
and the introduction of Venn diagrams by Venn in 1881, the simple idea 
of a set has become commonplace, and young children happily manipulate 
sets such as {Catherine of Aragon, Anne Boleyn, Jane Seymour, Anne of
Cleves, Kathryn Howard, Katherine Parr}, or more prosaically {Alice, Bob},
or the set of numbers {5, 13, 17, 29, 37, 41, 53, 61, 73, 89}. In mathematics, we
consider sets of mathematical objects, such as the last of these examples. Can 
we not simply consider a mathematical object to be a collection of all those 
things which can be defined by a well-formed formula? Then a set would be 
something of the form ‘the collection of those things a for which the well- 
formed formula P(a) holds’, where P(x) is a well-formed formula with one 
free variable x, and conversely, each such formula would define a set. This 
approach is known as the comprehension principle. Unfortunately, it leads 
to contradictions. Consider the well-formed statement ‘x does not belong to 
x’; according to the comprehension principle, there should be a set b which 
consists of those sets which do not belong to themselves. Does b belong to b? 
If it does, it fails the criterion for belonging to b, and so it does not belong to 
b. But if it does not belong to b, then it meets the criterion, and so it belongs 
to b. Thus, either way, we reach a contradiction.

This phenomenon was described by Bertrand Russell in 1901, and is known 
as Russell’s paradox. It caused him a great deal of pain, as he described in 
his autobiography.² Concerning the events of May 1901, he wrote


Cantor had a proof that there is no greatest number, and it seemed 
to me that the number of things in the world should be the greatest 
possible. Accordingly, I examined his proof with some minuteness, 
and endeavoured to apply it to the class of all things there are. 
This led me to consider those classes which are not members of 
themselves, and to ask whether the class of all such classes is or 
is not a member of itself. I found that either answer implied its 
contradictory. 


He continued to consider the problem for several years. Describing the 
summers of 1903 and 1904, he wrote 


I was trying hard to solve the contradictions mentioned above. 
Every morning I would sit down before a blank sheet of paper. 
Throughout the day, with a brief interval for lunch, I would stare 
at the blank sheet. Often when evening came it was still empty. 


Russell’s paradox required a new approach to the theory of sets, which
would provide a framework where Russell’s paradox, and other paradoxes,
are avoided. In 1908, Zermelo introduced a system of axioms; these were
modified in 1922 by Fraenkel and Skolem. The resulting system, known as
the Zermelo–Fraenkel axiom system ZF, has stood the test of time, and it is
the one that we shall describe and use.


² The Autobiography of Bertrand Russell, George Allen and Unwin, 1967-69.


1.2 The first few axioms of set theory 


In Zermelo–Fraenkel set theory, the basic objects are all called sets, denoted
by upper- or lower-case letters, and there is one relation, ∈. Thus, if a and
b are sets, then either a ∈ b, or this is not so, in which case we write a ∉ b.
(We use the symbol / to mean ‘not’, in a similar way, for other relations.) If
a ∈ b, we say that a belongs to b, or that a is a member or element or point of
b, or, more simply, that a is in b.

The sets and the relation ∈ are required to satisfy certain axioms, and we
shall spend the rest of this chapter introducing and explaining them.


Axiom 1: The extension axiom 


This states that two sets are equal if and only if they have the same elements. 
Thus the set with members 1, 2 and 3 and the set with members 1, 3, 2 and 
1 are the same; the order in which they are listed is unimportant, as is the 
fact that repetition can occur. Set theory is all about membership, and about 
nothing else. 

If a and b are sets, and every member of a is a member of b, then we say
that a is a subset of b, or that b contains a, and write a ⊆ b or b ⊇ a. Thus the
extension axiom says that a = b if and only if a ⊆ b and b ⊆ a. If a ⊆ b and
a ≠ b, we say that a is a proper subset of b, or that a is properly contained in
b, and write a ⊂ b or b ⊃ a.


Axiom 2: The empty set axiom 


This states that there is a set with no members. The extension axiom then
implies that there is only one such set: we denote it by ∅ and call it the empty
set. It is easy to overlook the empty set: arguments involving it take on an
idiosyncratic form. It also has a rather paradoxical nature, since it is a subset
of every set a (if not, there is a member b of ∅ which is not in a; but ∅ has
no members). Thus (looking ahead to some familiar sorts of sets) we can
consider the set F of natural numbers n greater than 2 for which there exist
natural numbers a, b and c with aⁿ + bⁿ = cⁿ, and we can consider the set
Q of those complex quadratic polynomials of the form z² + az + b for which
the equation z² + az + b = 0 has no complex solutions. Then F = Q, since
each is the empty set.




The next four axioms are concerned with creating new sets from old. 


Axiom 3: The pairing axiom 

This says that if a and b are sets then there exists a set whose members are 
a and b. The extension axiom again says that there is only one such set: we 
denote it by {a,b}. Note that {a,b} = {b, a}: we have an unordered pair. We 
can take a = b: then the set {a,a} has only one element a. We write this set 
as {a} and call it a singleton set. 

We can use the pairing axiom to define ordered pairs. If a and b are sets,
we define the ordered pair (a,b) to be the set {{a}, {a,b}}.


Proposition 1.2.1 If (a,b) and (c,d) are ordered pairs and (a,b) = (c,d),
then a = c and b = d.


Proof The proof makes repeated use of the extension axiom. First, suppose
that a = b. Then (a,b) = {{a}} = {{c}, {c,d}}, and so {c,d} = {a}, and
a = c = d. Thus a = b = c = d. Similarly, if c = d then a = b = c = d.
Finally, suppose that a ≠ b and c ≠ d. Since {a} ∈ (c,d), either {a} = {c}
or {a} = {c,d}. But if {a} = {c,d} then c = a = d, giving a contradiction.
Thus {a} = {c} and a = c. Since {a,b} ∈ (c,d), either {a,b} = {c} or
{a,b} = {c,d}. But if {a,b} = {c}, then a = c = b, giving a contradiction.
Thus {a,b} = {c,d}, and so b = c or b = d. But if b = c then b = c = a,
giving a contradiction. Thus b = d.
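
As an informal check of Proposition 1.2.1, the following Python sketch models the ordered pair {{a}, {a,b}} with frozensets. The names kuratowski_pair and components are illustrative and do not come from the text, and Python integers stand in for sets; this is a finite illustration under those assumptions, not a substitute for the proof.

```python
def kuratowski_pair(a, b):
    """Encode the ordered pair (a, b) as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def components(p):
    """Recover (a, b) from a pair built by kuratowski_pair."""
    members = list(p)
    # a is the only element common to every member of the pair.
    common = frozenset.intersection(*members)
    # b is the other element that occurs, or a again when a = b.
    rest = frozenset.union(*members) - common
    (a,) = common
    b = next(iter(rest)) if rest else a
    return a, b


# The pairs (1, 2) and (2, 1) are different sets, although {1, 2} = {2, 1}.
print(kuratowski_pair(1, 2) == kuratowski_pair(2, 1))
# When a = b the pair collapses to {{a}}.
print(kuratowski_pair(3, 3) == frozenset({frozenset({3})}))
print(components(kuratowski_pair(1, 2)))
```

Recovering both coordinates from the set alone is what the proposition guarantees: equal pairs must have equal first and equal second components.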


If A is a set, then all its members are sets, and they, in turn, can have 
members. 


Axiom 4: The union axiom 


This says that there is a set whose elements are exactly the sets which are
members of members of A. We denote this set by ⋃_{a∈A} a (here a is a variable,
so we could as well write ⋃_{x∈A} x) and call it the union of the members of A.
The essential feature of this axiom is that the sets whose members make up
the union must all be members of a single set; we cannot form the union of all
sets since, as we shall see, there is no set to which all sets belong. If A and B
are sets, we can consider the set ⋃_{C∈{A,B}} C. This is the set whose elements
are either in A or in B: we write this as A ∪ B.


Axiom 5: The power set axiom 


There is an essential difference between the statements b ∈ A (b is a member
of A) and b ⊆ A (b is a subset of A). The power set axiom states that if A
is a set, then there exists a set, the power set P(A) of A, whose elements
are the subsets of A. Thus b ∈ P(A) if and only if b ⊆ A. For example,
the elements of P({a,b}) are ∅, {a}, {b} and {a,b}, and the ordered pair
(a,b) = {{a}, {a,b}} is an element of P(P({a,b})).
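
The example above can be checked mechanically for a small set. The sketch below is a hypothetical helper (power_set is not a name used in the text) that builds P(A) for a finite set with frozensets, and then confirms that the ordered pair (a,b) lies in P(P({a,b})). The strings "a" and "b" stand in for two distinct sets.

```python
from itertools import chain, combinations


def power_set(A):
    """Return the set of all subsets of the finite set A."""
    elems = list(A)
    subsets = chain.from_iterable(
        combinations(elems, r) for r in range(len(elems) + 1)
    )
    return frozenset(frozenset(s) for s in subsets)


a, b = "a", "b"
P = power_set({a, b})      # the four subsets of {a, b}
PP = power_set(P)          # the sets of subsets of {a, b}
# The ordered pair (a, b) = {{a}, {a, b}} is a set of subsets of {a, b}.
pair = frozenset({frozenset({a}), frozenset({a, b})})
print(len(P), len(PP), pair in PP)
```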


Axiom 6: The separation axiom 


This is particularly important, and is an axiom that is used all the time in
mathematics. It states that if A is a set and Q(x) is a well-formed formula,
then there exists a subset of A whose elements are just those members a of A
for which Q(a) holds. By extensionality, there is only one such set; we denote
it by {x ∈ A : Q(x)}. With this axiom in place, we can use the argument of
Russell’s paradox to show that there is no universal set to which every set
belongs.


Theorem 1.2.2 There is no set Ω such that if a is a set then a ∈ Ω.


Proof Suppose that such a set were to exist. The formula x ∉ x is a
well-formed formula, and so there exists a set b = {x ∈ Ω : x ∉ x}. Does
b ∈ b? If it does, it fails the criterion for membership, giving a contradiction.
If it does not, then it meets the criterion, and so belongs to b, giving another
contradiction. This exhausts all possibilities, and so no such universal set can
exist.


Let us give some more examples of the use of the separation axiom. Suppose
that A and B are sets. The expression x ∈ B is a well-formed formula, and
so the set {x ∈ A : x ∈ B} is a subset of A, the intersection of A and B,
denoted by A ∩ B. Note that A ∩ B = B ∩ A = {x ∈ B : x ∈ A}, since a
set c is an element of either intersection if and only if it belongs to both A
and B. We say that A and B are disjoint if A ∩ B = ∅; that is, A and B are
disjoint if they have no member in common. Similarly, the expression x ∉ B is
a well-formed formula, and so the set {x ∈ A : x ∉ B} is a subset of A, the
set difference A \ B. A \ B is also called the relative complement of B in A.
It frequently happens that we consider a particular set A, say, and are only
concerned with subsets of A. In this case, if B ⊆ A, then we denote A \ B by
C(B), or B^c, and call it the complement of B.

We can extend the notion of intersection considerably. Suppose that A is
a set. The expression ‘for all a ∈ A, x ∈ a’ is a well-formed formula with a a
bound variable and x a free variable, and so we can form the set


{x ∈ ⋃_{a∈A} a : for all a ∈ A, x ∈ a}.


This is the intersection ⋂_{a∈A} a of all the sets a that belong to A: b ∈ ⋂_{a∈A} a
if and only if b ∈ a, for each a ∈ A. Here again a is a variable, and we could
also write ⋂_{x∈A} x. We must reconcile the two definitions of intersection that
we have made: this is easy because A ∩ B = ⋂_{C∈{A,B}} C.
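
The construction of the intersection from the union and the separation axiom can be mirrored for finite sets. In this sketch, which is an illustration under the assumption that a set of sets is modelled as a Python collection of frozensets, big_union and big_intersection are hypothetical names; big_intersection separates out of the union exactly those x that lie in every member, as in the displayed formula.

```python
def big_union(A):
    """The union of the members of A: everything that is a member of a member."""
    result = frozenset()
    for a in A:
        result |= a
    return result


def big_intersection(A):
    """Those x in the union of A that belong to every member a of A."""
    return frozenset(x for x in big_union(A) if all(x in a for a in A))


S = frozenset({1, 2, 3})
T = frozenset({2, 3, 4})
family = {S, T}
print(sorted(big_union(family)))
print(sorted(big_intersection(family)))
# The two definitions of the intersection of two sets agree.
print(big_intersection(family) == frozenset(x for x in S if x in T))
```

Because the intersection is carved out of the union, applying this definition to the empty family gives the empty set rather than a "set of everything".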

A word about notation here. Our aim will be to be accurate and clear
without being pedantic. Suppose that A is a set. For each a ∈ A, we can form
the intersection ⋂_{α∈a} α. Using the separation axiom, we can then define the
set I whose elements are exactly these intersections, and can then form the
set ⋃_{i∈I} i. In fact, we write this in the form


⋃_{a∈A} (⋂_{α∈a} α),


and use other similar expressions. In the same way, we shall use natural
variations of the notation {x ∈ A : Q(x)} to denote sets whose existence
is ensured by the separation axiom; but in each case such a set is a
subset of a given set, and it can be written, at greater length, in the form
{x ∈ A : Q(x)}.

From now on, we shall define sets without appealing to the axioms to ensure 
that they are in fact sets. It is a useful exercise for the reader to consider, in 
each case, how suitable justification can be given. 

It is unfortunately the case that the separation axiom is not strong enough 
for all purposes, and another axiom, the replacement axiom, is needed. We 
shall defer discussion of this and of the other axioms of ZF, until later. Let 
us first see what we can do with the axioms that we now have. 


Exercises 


Suppose that A, B, C, D are sets.


1.2.1 Show that A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

1.2.2 Show that A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

1.2.3 Show that A \ (B ∪ C) = (A \ B) ∩ (A \ C).

1.2.4 Which of the following statements are necessarily true?
(a) P(A ∩ B) = P(A) ∩ P(B).
(b) P(A ∪ B) = P(A) ∪ P(B).

1.2.5 Define a set I such that ⋃_{i∈I} i = ⋃_{a∈A} (⋂_{α∈a} α).

1.2.6 Does ⋃_{a∈A} (⋂_{α∈a} α) necessarily contain ⋂_{a∈A} (⋃_{α∈a} α)? Is ⋃_{a∈A} (⋂_{α∈a} α)
necessarily contained in ⋂_{a∈A} (⋃_{α∈a} α)?

1.2.7 The symmetric difference a Δ b of two sets a and b is the set (a \ b) ∪ (b \ a).
Establish the following:
(a) A Δ B = (A ∪ B) \ (A ∩ B).


1.3 Relations and partial orders


The Cartesian product A × B of two sets A and B is the set of all ordered
pairs (a,b) with a ∈ A and b ∈ B. More formally,


A × B = {x ∈ P(P(A ∪ B)) : there exists a ∈ A and there exists b ∈ B
such that x = {{a}, {a,b}}}.


(The term Cartesian honours René Descartes, who introduced coordinates 
to the plane, so that points in the plane are represented by ordered pairs of 
real numbers; the plane is thus represented as the Cartesian product of two 
copies of the set of real numbers.) 

A relation on A × B is then simply a subset R of A × B. It is customary
to write aRb if (a,b) ∈ R. The set


{a ∈ A : there exists b ∈ B such that (a,b) ∈ R}


is then called the domain of R, and the set


{b ∈ B : there exists a ∈ A such that (a,b) ∈ R}


is called the range of R. A relation on A × A is called a relation on A.

Let us give some examples. First, if A is a set then


∈_A = {(b,B) ∈ A × P(A) : b ∈ B}


is a relation on A × P(A). Recall that we introduced the relation ∈ on the
collection of all sets, which we have seen is not a set; ∈_A is the restriction to
a set and its subsets.

Secondly, if A is a set then


⊆_A = {(B,C) ∈ P(A) × P(A) : B ⊆ C}


is a relation on P(A). This is an example of a partial order relation. A relation
≤ on a set A is a partial order, or partial order relation, if

(i) if a ≤ b and b ≤ c then a ≤ c (transitivity), and

(ii) a ≤ b and b ≤ a if and only if a = b.

If a ≤ b then we say that a is less than or equal to b, or that b is greater
than or equal to a, and we also write b ≥ a.

Partial order relations play an important part in analysis. We make some 
definitions concerning partial orders here, and will consider them in more 
detail later. 

Suppose that ≤ is a partial order on a set A, that a ∈ A and that B is a
subset of A.


• a is an upper bound of B if b ≤ a for all b ∈ B.
• a is a lower bound of B if a ≤ b for all b ∈ B.


An upper bound of B need not belong to B. If it does, it is the greatest
element of B. B has at most one greatest element, but may have no greatest
element. Least elements are defined in the same way.


• a is a maximal element of B if a ∈ B, and if b ∈ B and a ≤ b then a = b.
• a is a minimal element of B if a ∈ B, and if b ∈ B and b ≤ a then a = b.


A greatest element of B is a maximal element of B, but the converse need
not hold.


• a is the supremum, or least upper bound, of B if a is an upper bound of B,
and if c is an upper bound of B, then a ≤ c. In other words, a is the least
element of the set of upper bounds of B.

• a is the infimum, or greatest lower bound, of B if a is a lower bound of B,
and if c is a lower bound of B, then c ≤ a. In other words, a is the greatest
element of the set of lower bounds of B.


B has at most one least upper bound, but may have no least upper bound.
If a is the least upper bound of B then a may or may not be an element of
B. If a is an element of B, then a is the least upper bound of B if and only
if a is the greatest element of B.

If a ≤ b or b ≤ a then we say that a and b are comparable. In general, not
all pairs are comparable. If, however, any two elements of A are comparable,
then we say that the relation is a total order. As an example, the usual order
on the set of natural numbers N = {1, 2, 3, ...} (which we shall consider in
Section 2.1) is a total order.
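
These definitions can be explored on a small example. The sketch below takes, as an assumption for illustration only, the set A = {1, ..., 12} partially ordered by divisibility (a ≤ b when a divides b); this example and the function names are not from the text. It computes upper bounds, greatest and maximal elements, suprema and infima directly from the definitions.

```python
A = range(1, 13)


def leq(a, b):
    """The partial order used here: a <= b means that a divides b."""
    return b % a == 0


def upper_bounds(B):
    return {a for a in A if all(leq(b, a) for b in B)}


def lower_bounds(B):
    return {a for a in A if all(leq(a, b) for b in B)}


def greatest(S):
    """The greatest element of S, or None if S has none."""
    for a in S:
        if all(leq(s, a) for s in S):
            return a
    return None


def least(S):
    """The least element of S, or None if S has none."""
    for a in S:
        if all(leq(a, s) for s in S):
            return a
    return None


def maximal(B):
    """Elements a of B with no strictly larger element of B above them."""
    return {a for a in B if all(a == b for b in B if leq(a, b))}


def supremum(B):
    # The least element of the set of upper bounds, if there is one.
    return least(upper_bounds(B))


def infimum(B):
    # The greatest element of the set of lower bounds, if there is one.
    return greatest(lower_bounds(B))


print(upper_bounds({2, 3}), supremum({2, 3}), greatest({2, 3}))
print(maximal({2, 3}), infimum({4, 6}), supremum({5, 7}))
```

Here {2, 3} has a supremum that is not one of its elements, has two maximal elements but no greatest element, and {5, 7} has no upper bound in A at all, so it has no supremum.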

The definition of the notion of partial order includes equality. There is
a closely related notion which forbids equality. Suppose that ≤ is a partial
order relation on a set A. Then the relation


{(a,b) ∈ A × A : a ≤ b and a ≠ b}


is a strict partial order on A. It is denoted by < and satisfies
(i) if a < b and b < c then a < c (transitivity), and
(ii) a < a does not hold for any a ∈ A.
Conversely, if < is a strict partial order on A then the relation


{(a,b) ∈ A × A : a < b or a = b}


is a partial order.


Exercises 


1.3.1 Which of the following statements are necessarily true?
(a) A × (B ∪ C) = (A × B) ∪ (A × C).
(b) (A × B) ∪ (C × D) = (A ∪ C) × (B ∪ D).
(c) (A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D).

1.3.2 Suppose that ≤₁ is a partial order on A₁, and that ≤₂ is a partial order
on A₂. Show that the relation


{((a₁, a₂), (b₁, b₂)) ∈ (A₁ × A₂) × (A₁ × A₂) : a₁ ≤₁ b₁ and a₂ ≤₂ b₂}


is a partial order on A₁ × A₂.

1.3.3 Show that a subset of a partially ordered set can have at most one
greatest element, and at most one supremum.

1.3.4 This question assumes knowledge of the set N of natural numbers,
and of counting. Let P(N) be given the partial order defined by
inclusion, as above. Let P_n(N) be the set of subsets of N with at most n
elements.
(a) What are the upper bounds of P_n(N) in P(N)?
(b) Does P_n(N) have a supremum? If so, is it an element of P_n(N)?
(c) What are the maximal elements of P_n(N)?

1.3.5 Suppose that a is a maximal element of a subset B of a totally ordered
set A. Show that a is the greatest element of B.

1.3.6 Give an example of a subset of a totally ordered set which has a 
supremum but no greatest element. 


1.4 Functions 


The notion of function developed slowly from the time of Descartes and 
Leibniz until the end of the nineteenth century. Originally, a function was 
something that was given by an analytic formula, but confusion and dispute 
arose about what this meant, and confusion was also caused by the fact that
two formulae could give the same values. Here we simply define a function,
or, synonymously, a mapping, or a map (we shall use the terms
interchangeably), from a set A to a set B to be a relation f on A × B which
satisfies the condition

for each a ∈ A, there is a unique b ∈ B such that (a,b) ∈ f.

In these circumstances, we write b = f(a), so that f = {x ∈ A × B :
x = (a, f(a))}. The element f(a) of B is called the image of a under f.

It is however helpful to consider a function as some sort of dynamic process 
(perhaps taking place in a black box): an element a of A is put in, and f(a) 
comes out: 


a → | black box | → f(a).


Thus we write f : A → B for a function from A to B. The set {x ∈ A × B :
x = (a, f(a))} is then called the graph G_f of f. The set of all mappings from
A to B is denoted by B^A; the reason for this notation may become clear later.

Let us consider some examples. First, suppose that f : A → B is a function.
Then we can define a function P(f) : P(A) → P(B) by setting


P(f)(C) = {x ∈ B : there exists a ∈ C such that f(a) = x},


for C a subset of A. It is unfortunately standard practice to denote this
function by f. This can be misleading; for example, it may happen that ∅ ∈ A,
and that f(∅), an element of B, is not the empty set. Then f(∅) ≠ ∅, whereas
P(f)(∅) = ∅. In spite of this defect, we shall follow standard practice; with
caution and common sense, we can avoid the difficulty we have just described.
Following standard practice, the subset f(C) of B is also called the image of
C under f. We can also define a function f⁻¹ : P(B) → P(A) by setting


f⁻¹(D) = {x ∈ A : f(x) ∈ D},


for D a subset of B. This notation is also unfortunate, as we shall shortly
see. The set f⁻¹(D) is called the inverse image of D; if b ∈ B then the set
f⁻¹({b}) is called the inverse image of b.

Suppose that A is a set. For a ∈ A, define s(a) = {a}; s is a mapping from
A into P(A). It is an example of an injective mapping. A mapping f : A → B
is injective, or an injection, or one-one, if distinct elements of A have distinct
images in B; in other words, if f(a) = f(a′) then a = a′.

Suppose that B is a subset of a set A. The inclusion map j_B : B → A
is defined by setting j_B(b) = b, for b ∈ B. Thus b ∈ B, whereas j_B(b) is an
element of A. j_B is again injective. As a special case, when B = A we have
the identity map i_A : A → A defined by setting i_A(a) = a for a ∈ A.

Let us consider a Cartesian product A × B, where A and B are non-empty
sets. For (a,b) ∈ A × B, let π_A((a,b)) = a and let π_B((a,b)) = b. Then π_A is
a mapping from A × B to A, and π_B is a mapping from A × B to B; they are
the coordinate projections of A × B onto A and B, respectively. The elements
a and b are the coordinates of (a,b). The mappings π_A and π_B are examples
of surjective mappings. A mapping f : A → B is surjective, or a surjection
or onto, if f(A) = B; every element of B is the image of at least one element
of A.

A mapping f : A → B is bijective, or a bijection, or a one-one
correspondence, if it is both injective and surjective; every element b of B is
the image under f of exactly one element of A. We denote this element by
f⁻¹(b); then f⁻¹ is a bijective mapping of B onto A. We have thus used the
term f⁻¹ in two different senses: if f : A → B is a mapping, the mapping
f⁻¹ : P(B) → P(A) is always defined; the mapping f⁻¹ : B → A is only defined
when f is bijective. Once again, caution and common sense are called for.
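
The two senses of f⁻¹ can be made concrete for finite sets. In the following sketch, which rests on the assumption that a mapping f : A → B is modelled as a Python dict whose keys are the elements of A, the helper names are hypothetical. inverse_image is defined for every mapping, whereas inverse refuses to return anything unless f is a bijection onto B.

```python
def image(f, C):
    """f(C): the set of images of the elements of the subset C of A."""
    return {f[a] for a in C}


def inverse_image(f, D):
    """The inverse image of D: the elements of A that f sends into D."""
    return {a for a in f if f[a] in D}


def is_injective(f):
    # Distinct elements of A must have distinct images.
    return len(set(f.values())) == len(f)


def is_surjective(f, B):
    # Every element of B must be the image of some element of A.
    return set(f.values()) == set(B)


def inverse(f, B):
    """The mapping f^{-1} : B -> A, defined only when f is a bijection."""
    if not (is_injective(f) and is_surjective(f, B)):
        raise ValueError("f is not a bijection onto B")
    return {b: a for a, b in f.items()}


f = {1: "x", 2: "y", 3: "x"}            # neither injective nor onto B
B = {"x", "y", "z"}
print(image(f, {1, 3}), inverse_image(f, {"x"}), inverse_image(f, {"z"}))
print(is_injective(f), is_surjective(f, B))

g = {1: "x", 2: "y"}                    # a bijection onto {"x", "y"}
print(inverse(g, {"x", "y"}))
```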

Suppose that (A, ≤_A) and (B, ≤_B) are two partially ordered sets and
that f : A → B is a mapping from A to B. The mapping f is said to be
increasing if f(a) ≤_B f(a′) whenever a ≤_A a′, and to be strictly increasing if
f(a) <_B f(a′) whenever a <_A a′. It is said to be decreasing if f(a) ≥_B f(a′)
whenever a ≤_A a′, and to be strictly decreasing if f(a) >_B f(a′) whenever
a <_A a′. It is said to be monotonic if it is either increasing or decreasing, and
to be strictly monotonic if it is either strictly increasing or strictly decreasing.

Suppose that f is a mapping from A to B and that g is a mapping from
B to C. We can then define the composite mapping g ∘ f from A to C by
setting (g ∘ f)(a) = g(f(a)), for a ∈ A. Note the order of the terms: first we
use the mapping f and then the mapping g, but the terms in the composite
mapping g ∘ f come in the opposite order.

As examples, if f is a bijection from A onto B then f⁻¹ ∘ f : A → A is the
identity mapping i_A on A, and f ∘ f⁻¹ : B → B is the identity mapping i_B
on B.

The composition of mappings is associative: if f : A → B, g : B → C and
h : C → D are mappings then


(h ∘ (g ∘ f))(a) = h((g ∘ f)(a)) = h(g(f(a)))
= (h ∘ g)(f(a)) = ((h ∘ g) ∘ f)(a),


so that h ∘ (g ∘ f) = (h ∘ g) ∘ f.
Suppose that f : A → B is a mapping, and that f(A) ⊆ D ⊆ B. Then
G_f ⊆ A × D, and we can consider f as a mapping from A into D. We
usually denote this mapping by f, unless this is likely to cause confusion.
Let us here denote the mapping from A to f(A) by f̃. Then we have the
factorization f = j_{f(A)} ∘ f̃, where f̃ is surjective, and the inclusion mapping
j_{f(A)} : f(A) → B is injective.

Next, consider a subset C of A. Then G_f ∩ (C × B) is the graph of a
mapping from C to B. This is the restriction f|_C of f to C. If c ∈ C then
f|_C(c) = f(c).

A bijective mapping f : A → A is called a permutation of A. As an example,
suppose that a and b are elements of A. The mapping τ, or τ_{a,b}, from A to A
defined by


τ(a) = b, τ(b) = a, τ(c) = c for all other c ∈ A,


the mapping which transposes a and b, is a permutation of A. The set of
permutations of A is denoted by Σ_A.

We can describe the composition properties of Σ_A in algebraic terms.

A group is a non-empty set G, together with a mapping or operation ∘ :
G × G → G which satisfies:


(i) composition is associative: that is, (g ∘ h) ∘ j = g ∘ (h ∘ j) for g, h, j ∈ G;
(ii) there exists e ∈ G such that e ∘ g = g ∘ e = g, for all g ∈ G;
(iii) for each g ∈ G there exists g⁻¹ ∈ G such that g ∘ g⁻¹ = g⁻¹ ∘ g = e.

If
(iv) g ∘ h = h ∘ g for all g, h ∈ G, then G is said to be abelian, or commutative.


Note that the element e is uniquely determined by (ii), for if e′ also satisfies
(ii), then e′ = e′ ∘ e = e. The element e is called the identity element of G,
and is frequently denoted by e_G. Similarly, if g ∈ G then the element g⁻¹ is
uniquely determined by (iii); for if g ∘ h = e then


h = e ∘ h = (g⁻¹ ∘ g) ∘ h = g⁻¹ ∘ (g ∘ h) = g⁻¹ ∘ e = g⁻¹.


The element g⁻¹ is called the inverse of g.

It then follows immediately from the earlier discussions that Σ_A is a group,
when the group composition is taken to be the composition of functions and
the identity map i_A is taken as the identity element.
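
For a small set the group axioms for Σ_A can be verified by brute force. The sketch below assumes A = {0, 1, 2} and represents each permutation as a dict; the variable and function names are illustrative and not taken from the text. It checks closure, associativity, the identity and inverses, and also shows that this particular group is not abelian.

```python
from itertools import permutations

A = (0, 1, 2)
# Every permutation of A, represented as a dict a -> f(a).
sigma = [dict(zip(A, p)) for p in permutations(A)]
identity = {a: a for a in A}


def compose(g, f):
    """The composite g o f: first apply f, then g."""
    return {a: g[f[a]] for a in f}


def inverse(f):
    return {b: a for a, b in f.items()}


closed = all(compose(g, f) in sigma for g in sigma for f in sigma)
associative = all(
    compose(h, compose(g, f)) == compose(compose(h, g), f)
    for f in sigma for g in sigma for h in sigma
)
has_identity = all(
    compose(identity, f) == f == compose(f, identity) for f in sigma
)
has_inverses = all(
    compose(f, inverse(f)) == identity == compose(inverse(f), f) for f in sigma
)
abelian = all(compose(g, f) == compose(f, g) for f in sigma for g in sigma)

print(len(sigma), closed, associative, has_identity, has_inverses, abelian)
```

A check of this kind covers only one finite example; the general statement follows from the associativity of composition and the properties of bijections described above.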


Axiom 7: The replacement axiom 


Let us end this section by stating the replacement axiom, since it has a
function-like quality. A well-formed formula Q(x, y) with free variables x and
y is said to determine a function if whenever a is a set then there is at most
one set b for which Q(a, b) holds. If there is a set b for which Q(a, b) holds, then
we write b = Q(a), and call b the image of a. The replacement axiom then
states that if Q(x, y) is a well-formed formula which determines a function,
and if A is a set, then the collection of all images Q(a), as a varies in A, is a
set, which we denote by Q(A). Thus


{a ∈ A : there exists b ∈ Q(A) for which Q(a, b) holds}


is a subset D_A(Q) of A, and Q defines a surjection of D_A(Q) onto Q(A).
We shall not make explicit use of this axiom.


Exercises


1.4.1 Suppose that f : A → B, and that C, D are subsets of A and that E, F
are subsets of B. Which of the following statements are necessarily
true?
(a) f(C ∪ D) = f(C) ∪ f(D).
(b) f(C ∩ D) = f(C) ∩ f(D).
(c) f⁻¹(E ∪ F) = f⁻¹(E) ∪ f⁻¹(F).
(d) f⁻¹(E ∩ F) = f⁻¹(E) ∩ f⁻¹(F).

1.4.2 Suppose that f : A → B and g : B → C are mappings. What is the
graph of g ∘ f? Verify that g ∘ f is a mapping.

1.4.3 Suppose that f is a mapping from A to B, where A and B are
non-empty sets. A mapping l : B → A is a left inverse of f if l ∘ f = i_A, the
identity on A. Show that if f has a left inverse, then f is injective, and
that if f is injective, then f has a left inverse.

1.4.4 Suppose that f is a mapping from A to B, where A and B are
non-empty sets. A mapping r : B → A is a right inverse of f if f ∘ r = i_B,
the identity on B. Show that if f has a right inverse, then f is surjective.
Does a surjective mapping always have a right inverse? Think about
this, and then read Section 1.9.

1.4.5 Suppose that f : A → B has a left inverse l and a right inverse r. Show
that f is a bijection and that l = r = f⁻¹.

1.4.6 This question establishes basic facts about groups that we shall need
later. A mapping θ from a group G to a group G′ is a homomorphism
if θ(g₁ ∘ g₂) = θ(g₁) ∘ θ(g₂) for all g₁, g₂ ∈ G. Suppose that θ : G → G′
is a homomorphism.
(a) Show that θ(e) = e′, where e is the identity element of G, e′ the
identity element of G′.
(b) Show that θ(g⁻¹) = (θ(g))⁻¹ for all g ∈ G.
(c) Show that if H is a subgroup of G then θ(H) is a subgroup of G′.
(d) Show that if H′ is a subgroup of G′ then θ⁻¹(H′) is a subgroup
of G.
(e) A bijective homomorphism is called an isomorphism. Show that if
θ is an isomorphism then θ⁻¹ : G′ → G is also an isomorphism.


1.5 Equivalence relations 


There is another sort of relation that we shall use later. The axiom of exten- 
sionality tells us that two sets are equal if and only if they have the same 
members. There are however many occasions when two different sets serve 
the same purpose, and we would like to identify them in some way. For exam- 
ple, we express a positive rational number as a fraction p/q, where (p,q) is 
an ordered pair of natural numbers. The rational number 1/2 is the same as 
the rational number 3/6, but the ordered pairs (1,2) and (3,6) are different. 
In this circumstance, we say that (1,2) and (3,6) are equivalent. This leads 
to the concept of an equivalence relation. 

An equivalence relation on a set A is a relation on A (frequently, as here, 
denoted by ~) which satisfies 


(i) if a ~ b and b ~ c then a ~ c (transitivity);
(ii) if a ~ b then b ~ a (symmetry);
(iii) a ~ a for all a ∈ A (reflexivity).


As a trivial example, the relation a = b is an equivalence relation. For a
less trivial example, suppose that f : A → B is a mapping. Let a ~ a′ if
and only if f(a) = f(a′). Then it is easy to check that ~ is an equivalence
relation on A. We shall see that any equivalence relation can be expressed in 
this way. 

Suppose that ~ is an equivalence relation on a set A and that a ∈ A.
We define the equivalence class E_a to be the set {x ∈ A : a ~ x}. This is
the traditional name, but an equivalence class is certainly a set. Note that
a ∈ E_a, so that E_a is a non-empty set.


Proposition 1.5.1 Suppose that ~ is an equivalence relation on a set A.
If a ~ a′ then E_a = E_a′, and if a ≁ a′ then E_a and E_a′ are disjoint.


Proof Suppose that a ~ a′. If a′ ~ c then a ~ c, by transitivity, and so
E_a′ ⊆ E_a. Further a′ ~ a, by symmetry, and so E_a ⊆ E_a′.

Suppose that b ∈ E_a ∩ E_a′. Then a ~ b and a′ ~ b, so that b ~ a′, by
symmetry, and a ~ a′, by transitivity. Thus if a ≁ a′, then E_a ∩ E_a′ = ∅.


We now say that a subset E of A is an equivalence class if there exists
a ∈ A such that E = E_a. We denote the set of equivalence classes by A/~.
A/~ is a subset of P(A).


Corollary 1.5.2 A = ∪_{E ∈ A/~} E is a union of disjoint equivalence classes.


Proof We have seen that distinct equivalence classes are disjoint. Their
union is A, since if a ∈ A then a ∈ E_a.


This leads to the following definition. Suppose that A is a set. A subset Π
of P(A) is a partition of A if


(i) each E ∈ Π is non-empty;
(ii) A = ∪_{E ∈ Π} E;
(iii) distinct elements of Π are disjoint.


Thus if ~ is an equivalence relation on A then the set A/~ of equivalence
classes is a partition of A.
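
For a finite set, the passage from an equivalence relation to its partition can be checked mechanically. The following Python sketch is an illustration only: the function names and the example relation (congruence modulo 3 on {0, ..., 6}) are assumptions made here, not part of the text.

```python
def equivalence_classes(A, related):
    """Return the set of classes E_a = {x in A : a ~ x}, each as a frozenset."""
    return {frozenset(x for x in A if related(a, x)) for a in A}


def is_partition(A, classes):
    """Check conditions (i)-(iii) of the definition of a partition of A."""
    nonempty = all(len(E) > 0 for E in classes)
    covers = set().union(*classes) == set(A)
    disjoint = all(E == F or not (E & F) for E in classes for F in classes)
    return nonempty and covers and disjoint


# Hypothetical example: a ~ b if and only if a - b is divisible by 3.
A = range(7)
classes = equivalence_classes(A, lambda a, b: (a - b) % 3 == 0)
```

Here `classes` consists of the three sets {0, 3, 6}, {1, 4} and {2, 5}, and `is_partition(A, classes)` holds, in agreement with Corollary 1.5.2.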

Let ℰ_A denote the set of all equivalence relations on A, and let 𝒫_A denote
the set of all partitions of A. We shall show that there is a natural bijection
of ℰ_A onto 𝒫_A. If ~ ∈ ℰ_A, let k(~) = A/~. Then k is a mapping from ℰ_A to
𝒫_A, and it is easy to see that this is injective. Conversely, if Π is a partition
of A, and we set a ~ b if a and b are in the same element of Π, then it is easy
to check that ~ is an equivalence relation on A and that A/~ = Π. Thus k
is surjective, and the mapping from 𝒫_A to ℰ_A which we have just defined is
the inverse of k: k is a bijection of ℰ_A onto 𝒫_A.
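
On a very small set the bijection k can be verified by brute force. The sketch below is illustrative only (the names are assumptions made here): it enumerates every relation on a three-element set, keeps those that are equivalence relations, and sends each to its quotient.

```python
from itertools import product


def is_equivalence(A, R):
    """R is a set of ordered pairs; test reflexivity, symmetry, transitivity."""
    reflexive = all((a, a) in R for a in A)
    symmetric = all((b, a) in R for (a, b) in R)
    transitive = all((a, d) in R for (a, b) in R for (c, d) in R if b == c)
    return reflexive and symmetric and transitive


def quotient(A, R):
    """k(~) = A/~, as a frozenset of equivalence classes."""
    return frozenset(frozenset(x for x in A if (a, x) in R) for a in A)


A = [0, 1, 2]
pairs = list(product(A, repeat=2))
# Every subset of A x A is a relation on A: 2**9 = 512 candidates.
relations = [
    frozenset(p for p, keep in zip(pairs, bits) if keep)
    for bits in product([False, True], repeat=len(pairs))
]
equivalences = [R for R in relations if is_equivalence(A, R)]
partitions = {quotient(A, R) for R in equivalences}
```

There are five equivalence relations and five distinct quotients, so on this example k is injective, as claimed.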

Suppose now that ~ is an equivalence relation on a set A, and that A/~
is the corresponding partition of A. We define a mapping q : A → A/~
by setting q(a) = E_a. Then q is a surjection, and a ~ a′ if and only if
q(a) = q(a′). The set A/~ is called the quotient of A by ~, and the mapping
q : A → A/~ is called the quotient mapping.

Now suppose that f : A → B is a mapping. Define an equivalence relation
~ by setting a ~ a′ if and only if f(a) = f(a′), and let q : A → A/~ be
the quotient mapping. If E = E_a ∈ A/~ and a′ ∈ E then f(a) = f(a′).
We can therefore define f̃(E) = f(a), and we obtain a well-defined mapping
f̃ of A/~ onto f(A). Suppose that f̃(E) = f̃(E′), that a ∈ E and that
a′ ∈ E′. Then f(a) = f(a′), so that a ~ a′ and E = E′. Thus f̃ is one-one,
and so f̃ : A/~ → f(A) is a bijection. We have therefore factorized f as
f = j_{f(A)} ∘ f̃ ∘ q, where q is a surjection, f̃ is a bijection and the inclusion
mapping j_{f(A)} : f(A) → B is injective. Thus we have the following diagram
of mappings:


           f
     A --------> B
     |           ^
   q |           | j_{f(A)}
     v           |
    A/~ ------> f(A)
           f̃


This diagram is commutative: the outcome of the direct journey from A to
B is the same as the outcome of the longer journey going round the other
three sides of the diagram.
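
The factorization can be made concrete for a mapping on a finite set. In the Python sketch below, which is an illustration under assumptions of our own (the names `factorize` and `f_tilde`, and the example of squaring on {-2, ..., 2}), the inclusion mapping is simply the identity on values, so commutativity of the diagram amounts to `f_tilde[q(a)] == f(a)`.

```python
def factorize(A, f):
    """Factor f on the finite set A through the quotient A/~, where a ~ a' iff f(a) = f(a')."""

    def q(a):
        # Quotient mapping: a goes to its class E_a = {x in A : f(x) = f(a)}.
        return frozenset(x for x in A if f(x) == f(a))

    quotient = {q(a) for a in A}
    # f_tilde(E) = f(a) for any a in E; well defined since f is constant on E.
    f_tilde = {E: f(next(iter(E))) for E in quotient}
    return q, quotient, f_tilde


def square(a):
    return a * a


A = [-2, -1, 0, 1, 2]
q, quotient, f_tilde = factorize(A, square)
```

The quotient has the three classes {-2, 2}, {-1, 1} and {0}, and `f_tilde` maps them bijectively onto the image {4, 1, 0}.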


Exercises 


1.5.1 Suppose that σ is a permutation of a non-empty set A. A subset B of
A is σ-invariant if σ(B) = B. If a ∈ A let


O_a = ∩{B ∈ P(A) : a ∈ B and B is σ-invariant}.


(a) Show that O_a is σ-invariant.

(b) Suppose that O_a ∩ O_b ≠ ∅. Show that O_a = O_b. (Hint: Consider
O_a \ O_b.)

(c) A subset O of A is an orbit of σ if there exists a ∈ A such that
O = O_a. Show that the set of orbits is a partition of A. What is
the corresponding equivalence relation?

1.5.2 A subgroup H of a group G is a subset of G with the properties

(i) the identity of G belongs to H;

(ii) if h ∈ H then h⁻¹ ∈ H;

(iii) if h and h′ are in H then h ∘ h′ ∈ H.

Thus H is a group with the operations inherited from G.

Suppose now that H is a subgroup of the group Σ_A of permutations
of A. A subset B of A is H-invariant if σ(B) = B for each σ ∈ H.
Carry out a programme similar to that of the previous question.


1.6 Some theorems of set theory 


Although we have only met some of the axioms of ZF, we are already in a 
position to prove some interesting and important results. 


Theorem 1.6.1 (The Knaster–Tarski fixed-point theorem) Suppose that
A is a set and that f : P(A) → P(A) is an increasing function: if B ⊆ C ⊆ A
then f(B) ⊆ f(C). Then there exists G ⊆ A such that f(G) = G.


Proof Note that f is defined as a mapping from P(A) to itself: it is not
defined in terms of a mapping from A to itself. Thus ∅ ⊆ f(∅) and A ⊇ f(A);
the inclusions change direction. The theorem states that equality holds at
some intermediate subset.

We shall show that there exists a set G such that G ⊆ f(G) and f(G) ⊆ G;
the axiom of extensionality then ensures that G = f(G).

Let 𝒢 = {B ∈ P(A) : B ⊆ f(B)}, and let G = ∪_{B ∈ 𝒢} B. If B ∈ 𝒢
then B ⊆ G, and so f(B) ⊆ f(G). Thus B ⊆ f(B) ⊆ f(G). Consequently
G = ∪_{B ∈ 𝒢} B ⊆ f(G), and so G ∈ 𝒢. On the other hand, since G ⊆ f(G)
it follows that f(G) ⊆ f(f(G)), and so f(G) ∈ 𝒢. Thus f(G) ⊆ ∪_{B ∈ 𝒢} B =
G.
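
When A is finite the set G of the proof can be computed directly. The following Python sketch is only an illustration: the function names and the particular increasing map f are assumptions made here.

```python
from itertools import chain, combinations


def power_set(A):
    """All subsets of the finite set A, as frozensets."""
    items = list(A)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def knaster_tarski_fixed_point(A, f):
    """Return G, the union of all B in P(A) with B contained in f(B)."""
    G = frozenset()
    for B in power_set(A):
        if B <= f(B):
            G |= B
    return G


A = frozenset(range(5))


def f(B):
    # A hypothetical increasing map: keep the part of B in {0, 1, 2}, and add 3.
    return (B & {0, 1, 2}) | {3}


G = knaster_tarski_fixed_point(A, f)
```

For this choice of f, the subsets B with B ⊆ f(B) are exactly those not containing 4, so that G = {0, 1, 2, 3}, and indeed f(G) = G.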


Theorem 1.6.2 (The Schröder–Bernstein theorem) Suppose that A and
B are sets, and that f : A → B and g : B → A are injective mappings. Then
there exists a bijection h : A → B.


Proof The existence of f says that ‘A is no bigger than B’ and the existence 
of g says that ‘B is no bigger than A’. The conclusion then is that if both hold 
then ‘A and B are the same size’. We shall consider the problem of whether 
two sets are always comparable in size later (Theorem 1.9.2). 

We consider the mappings f : P(A) → P(B) and g : P(B) → P(A)
determined by f and g; they are clearly increasing maps. On the other hand
the mapping C_A : P(A) → P(A) defined by C_A(D) = A \ D is order reversing,
as is the corresponding mapping C_B : P(B) → P(B). Thus the composite
mapping S = C_A ∘ g ∘ C_B ∘ f is an increasing mapping from P(A) into
itself. The Knaster–Tarski fixed-point theorem then tells us that there exists
D ⊆ A such that S(D) = D; the restriction f|_D of f to D is a bijection of D
onto f(D). Let E = f(D), so that C_B(f(D)) = B \ E. Thus


A \ D = C_A(D) = C_A(S(D)) = C_A(C_A(g(C_B(f(D)))))
      = g(C_B(f(D))) = g(B \ E).


Consequently the restriction g|_{B \ E} of g to B \ E is a bijection of B \ E onto
A \ D; let k : A \ D → B \ E be its inverse. We now set h(a) = f|_D(a)
for a ∈ D, and set h(a) = k(a) for a ∈ A \ D; h clearly has the required
properties.


Figure 1.6. The Schröder–Bernstein theorem.


The next result uses the argument of Russell’s paradox. 


Theorem 1.6.3 (Cantor’s theorem) Suppose that f is a mapping from a 
set A to its power set P(A). Then f is not surjective. 


Proof Let B = {a ∈ A : a ∉ f(a)}. We claim that B is not in the image of
f. Suppose not, and suppose that B = f(b). Does b belong to B? If it does, it 
fails the criterion for membership of B, giving a contradiction. If it does not, 
then it meets the criterion for membership of B, again giving a contradiction. 
This exhausts the possibilities, and so B is not in the image of f. 
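
The diagonal argument can be tested exhaustively on a small set. The sketch below (illustrative only; the names are assumptions made here) runs through all 8³ = 512 mappings f from a three-element set A to P(A) and confirms that the set B of the proof is never a value of f.

```python
from itertools import chain, combinations, product


def power_set(A):
    """All subsets of the finite set A, as frozensets."""
    items = list(A)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def diagonal_set(A, f):
    """B = {a in A : a not in f(a)}, where f is given as a dict from A to P(A)."""
    return frozenset(a for a in A if a not in f[a])


A = [0, 1, 2]
subsets = power_set(A)
# Each tuple `images` lists f(0), f(1), f(2) for one mapping f : A -> P(A).
all_missed = all(
    diagonal_set(A, dict(zip(A, images))) not in images
    for images in product(subsets, repeat=len(A))
)
```

The value of `all_missed` is `True`: no mapping from A to P(A) is surjective.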


Corollary 1.6.4 Suppose that A is a non-empty set and that g : P(A) →
A is a mapping. Then g is not injective.


Proof The mapping s : A → P(A) defined by s(a) = {a} is injective. If
g were injective, then by the Schröder–Bernstein theorem there would be a
bijection h : A → P(A), which contradicts the theorem.


1.7 The foundation axiom and the axiom of infinity 


Suppose we start with the empty set. Repeatedly using the axioms that we 
have described so far to create new sets, we obtain an infinite collection of 
sets which satisfy these axioms. But each of these sets has only finitely many 
members. This may be satisfactory for certain areas of mathematics, such as 
finite group theory, or the mathematics of computer science, but in mathe- 
matical analysis we need to consider sets with infinitely many members. We 
now introduce two further axioms which enable us to do so. 


Axiom 8: The foundation axiom 
This states that if A is a non-empty set, then there exists an element a of A
such that a ∩ A = ∅: a and A have no element in common. As we shall see,
this excludes the possibility of infinite regress. It also prevents us from going 
round in circles. 


Proposition 1.7.1 If a is a set then a ∉ a.


Proof Consider the singleton set {a}. It has a member disjoint from {a}.
But it only has one member, namely a, and so a and {a} are disjoint. Since
a ∈ {a}, a ∉ a. Russell’s paradox has completely disappeared.


Let us introduce a construction that will shortly be useful to us. If a is a
set, we define a⁺ to be the set a ∪ {a}. The members of a⁺ are the members
of a, together with a. Thus a ⊆ a⁺.


Corollary 1.7.2 If a is a set then a ≠ a⁺.


Proof For a ∈ a⁺ and a ∉ a.


Here is another consequence of the foundation axiom.


Proposition 1.7.3 If a and b are sets and a ∈ b then b ∉ a.


Proof Consider the set {a, b} with elements a and b. By the foundation
axiom, either a ∩ {a, b} = ∅ or b ∩ {a, b} = ∅. But a ∈ b ∩ {a, b}, and so
a ∩ {a, b} = ∅. Since b ∈ {a, b}, b ∉ a.


A set A is called a successor set if ∅ ∈ A and if a⁺ ∈ A whenever a ∈ A.


Axiom 9: The axiom of infinity 
This states that there exists a set S which is a successor set. 


Having postulated the existence of a successor set, we now show that there 
is a smallest one. 


Theorem 1.7.4 There exists a successor set Z⁺ such that if T is any
successor set then Z⁺ ⊆ T.


Proof Note that if A is a set, all of whose elements are successor sets, then
it follows immediately from the definitions that the intersection ∩_{B ∈ A} B is
also a successor set. Suppose that S is a successor set. Let


Z⁺ = ∩{B ∈ P(S) : B is a successor set}.


Then if T is a successor set, T ∩ S is a successor set, so that Z⁺ ⊆
T ∩ S ⊆ T.


The minimality of Z⁺ is very powerful, and leads to the principle of
induction.

Let us use the foundation axiom to show that infinite regress is not allowed.


Proposition 1.7.5 Suppose that f : Z⁺ → A is a mapping. Then there
exists n ∈ Z⁺ such that f(n⁺) ∉ f(n).


Proof Consider the set f(Z⁺). By the foundation axiom, there exists
n ∈ Z⁺ such that no member of f(n) is in f(Z⁺). But f(n⁺) ∈ f(Z⁺),
and so f(n⁺) ∉ f(n).


We now show that we can take the minimal successor set Z⁺ as a model
for the natural numbers. Let us explain what this means. In 1888, Dedekind
described an axiom system for the natural numbers N = {1, 2, 3, ...}.
Independently, Peano introduced them, in a pamphlet written in Latin. They are
now known as Peano’s axioms. Replacing 1 by 0, they also serve as axioms for
the non-negative integers Z⁺ = {0, 1, 2, ...}; in this form, and in set-theoretic
terms, they state the following. There is a set P and a mapping s : P → P
(the successor function) such that


(P1) there is a distinguished element 0 of P;

(P2) if n ∈ P then s(n) ∈ P (this is included in the fact that s is a mapping
from P to itself);

(P3) if n ∈ P then s(n) ≠ 0;

(P4) s is injective: if m ∈ P and n ∈ P and s(m) = s(n) then m = n;

(P5) (the principle of induction) if A ⊆ P, if 0 ∈ A and if s(A) ⊆ A then A = P.


We set s(0) = 1, s(1) = 2, and so on.

There are many ways of constructing a pair (P, s) which satisfies these
axioms. Any pair (P, s) which does so is called a model for the non-negative
integers Z⁺.


Theorem 1.7.6 Let Z⁺ be the minimal successor set of Theorem 1.7.4, and
if n ∈ Z⁺, let s(n) = n⁺. Then the pair (Z⁺, s) is a model for the
non-negative integers.


Proof For (P1), we take the empty set ∅ to be the distinguished element 0.
If n ∈ Z⁺ then n⁺ ∈ Z⁺, so that (P2) holds. Since n ∈ n⁺, s(n) ≠ ∅, so
that (P3) holds. Suppose that m⁺ = n⁺, and that m ≠ n. Then m ∈ m⁺ =
n⁺ = n ∪ {n}. Since m ≠ n, m ∉ {n}. Thus m ∈ n. Similarly, exchanging the
roles of m and n, n ∈ m, contradicting Proposition 1.7.3. Thus (P4) holds.
Finally, (P5) follows from Theorem 1.7.4.
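
The first few members of the minimal successor set can be built explicitly. The Python sketch below is an illustration only: it represents hereditarily finite sets by `frozenset`, and the names `successor` and `naturals` are assumptions made here.

```python
def successor(a):
    """Return the set a+ = a U {a}, for a set a represented as a frozenset."""
    return a | frozenset({a})


zero = frozenset()          # the empty set plays the role of 0
naturals = [zero]
for _ in range(5):
    naturals.append(successor(naturals[-1]))

# Spot checks of (P3) and (P4) on the members constructed so far.
never_zero = all(successor(n) != zero for n in naturals)
injective = len({successor(n) for n in naturals}) == len(naturals)
contains_predecessor = all(n in successor(n) for n in naturals)
```

In this representation the set standing for n has exactly n members, namely the sets standing for 0, 1, ..., n - 1. Such checks on finitely many members are of course no substitute for the proof.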


As we have remarked, there are many other ways of constructing pairs 
(P, s) for which the Peano axioms hold. We need to show that any two are 
essentially the same, but we must wait until the results of the next section 
have been established before we can do this. 

The principle of induction allows us to prove results relating to the
non-negative integers. Suppose that Q(n) is a well-formed formula and that we
are interested in the subset T of P consisting of those n for which Q(n) holds.
Suppose that we can prove that 0 ∈ T, and that we can also prove that if
Q(n) holds then it follows that Q(s(n)) holds. Then T satisfies the conditions
of (P5), and so T = P; Q(n) holds for all n ∈ P. A proof which uses this
procedure is known as a proof by induction. We shall give many such proofs.
Here is one.


Proposition 1.7.7 Suppose that (P, s) satisfy the Peano axioms, with
distinguished element 0. Then s(P) = P \ {0}, and s : P → s(P) is a
bijection.


Proof The mapping s : P → s(P) is a bijection, by (P4), and 0 ∉ s(P), by
(P3). Let A = {0} ∪ s(P). Then 0 ∈ A, and if n ∈ A then s(n) ∈ A, so that
by (P5), A = P. Since 0 ∉ s(P), it follows that s(P) = P \ {0}.


Exercises 


1.7.1 Consider the two-point subset {0, 1} of Z⁺. If A is a set, we denote the
set of functions from A to {0, 1} by 2^A. Suppose that B ∈ P(A) and
x ∈ A. Let I_B(x) = 1 if x ∈ B, and I_B(x) = 0 if x ∉ B. I_B is the
indicator function of B. Show that the mapping B → I_B : P(A) → 2^A
is a bijection.


1.8 Sequences, and recursion 


In this section, we shall assume that (P, s) is a model for Z⁺, with
distinguished element 0. We write P as {0, 1, 2, ...}, where 1 = s(0), 2 = s(1), and
so on. A function f : P → A is then called a sequence, or an infinite sequence
in A, and is denoted by (f_n)_{n∈P}, or by (f_n)_{n=0}^∞, or as (f_0, f_1, f_2, ...). This
notation suggests another way of considering a function: the elements of P
act as labels or indices. Since f need not be one-one, an element of A may
have more than one label. Since f need not be surjective, some elements of
A may have no labels.

It is important to distinguish the sequence (f_n)_{n∈P} from its set of values


f(P) = {a ∈ A : there exists n ∈ P such that a = f_n},


but some flexibility is needed. When we consider a term f_n of a sequence,
we may consider f_n as the value of the sequence at n, but at the same time
keep in mind its index or label n. For sequences, as for fashion, the label is
as important as the object.

The principle of induction lets us prove results about sequences. Recursion 
allows us to construct sequences. 


Theorem 1.8.1 (The recursion theorem) Suppose that A is a non-empty
set, that f is a mapping of A to itself and that ā ∈ A. Then there is a unique
sequence (a_n)_{n∈P} such that a_0 = ā and a_{s(n)} = f(a_n) for n ∈ P.


Proof Recall that a sequence is a function from P to A, that a function is a
relation satisfying certain conditions, and that a relation is a subset of P × A.
Let us consider the set of relations on P × A. We say that a relation R is
recursive if

(i) 0Rā and

(ii) if nRa then s(n)Rf(a).

The set S of all recursive relations is non-empty, since P × A ∈ S. Let
g = ∩_{R ∈ S} R. We shall show that g is a function, that g(0) = ā and that
g(s(n)) = f(g(n)) for all n ∈ P. Thus a_n = g(n) satisfies the conditions of the
theorem.

Let


D(g) = {n ∈ P : there exists a ∈ A with (n, a) ∈ g}


be the domain of g. We must show that D(g) = P; we prove this by induction.
Since (0, ā) ∈ R for all R ∈ S, (0, ā) ∈ g. Thus 0 ∈ D(g). If n ∈ D(g), there
exists a such that (n, a) ∈ g, and so (n, a) ∈ R for all R ∈ S. Then since each
R ∈ S is recursive, (s(n), f(a)) ∈ R for all R ∈ S, and so (s(n), f(a)) ∈ g.
Thus s(n) ∈ D(g). By the induction principle, it follows that D(g) = P.

Next, we must show that if n ∈ P then there exists exactly one a ∈ A such
that (n, a) ∈ g. Again, we prove this by induction. Let


U = {n ∈ P : if (n, a) ∈ g and (n, a′) ∈ g then a = a′}.


First, we show that 0 ∈ U. (0, ā) ∈ g. Suppose that (0, a′) ∈ g and that
a′ ≠ ā. Let g′ = g \ {(0, a′)}. Then (0, ā) ∈ g′, since a′ ≠ ā. If (n, a) ∈ g′ ⊆ g
then (s(n), f(a)) ∈ g, and (s(n), f(a)) ≠ (0, a′), since s(n) ≠ 0, so that
(s(n), f(a)) ∈ g′. Thus g′ ∈ S, and so g ⊆ g′, giving a contradiction.

Secondly, we show that if n ∈ U then s(n) ∈ U. Suppose not. There
exists a unique a ∈ A such that (n, a) ∈ g, and so (s(n), f(a)) ∈ g. Since
s(n) ∉ U, there exists a′ ∈ A with a′ ≠ f(a) such that (s(n), a′) ∈ g. Let
g′ = g \ {(s(n), a′)}. We shall show that g′ ∈ S. As before, (0, ā) ∈ g′. Suppose
that (m, b) ∈ g′ ⊆ g. Then (s(m), f(b)) ∈ g. Thus if (s(m), f(b)) ∉ g′ then
(s(m), f(b)) = (s(n), a′). But then m = n and f(b) = a′. Since n ∈ U,
(m, b) = (n, a), and so b = a. Thus f(a) = a′, giving a contradiction.
Hence (s(m), f(b)) ∈ g′, so that g′ ∈ S and g ⊆ g′, which is impossible.
By the principle of induction, U = P, and so g is a function.

Finally, we show that g is unique. Once again, we prove this by induction.
Suppose that g′ is a function in S. Let G = {n ∈ P : g(n) = g′(n)}. Since
g′(0) = g(0) = ā, 0 ∈ G. Suppose that n ∈ G. Then g′(s(n)) = f(g′(n)) =
f(g(n)) = g(s(n)), so that s(n) ∈ G. By the induction principle, G = P, so
that g = g′.
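
In computational terms, the sequence whose existence the theorem guarantees is obtained by repeatedly applying f. The Python sketch below is illustrative only: it takes the non-negative integers to be Python's, with s(n) = n + 1, and the function names are assumptions made here.

```python
from itertools import islice


def recursive_sequence(a_bar, f):
    """Lazily yield a_0 = a_bar, a_1 = f(a_0), a_2 = f(a_1), and so on."""
    a = a_bar
    while True:
        yield a
        a = f(a)


def nth_term(a_bar, f, n):
    """Return a_n, the term with label n of the recursively defined sequence."""
    return next(islice(recursive_sequence(a_bar, f), n, None))


# Hypothetical example: a_0 = 1 and a_{n+1} = 2 * a_n.
first_terms = list(islice(recursive_sequence(1, lambda a: 2 * a), 6))
```

Here `first_terms` is `[1, 2, 4, 8, 16, 32]`. The generator never terminates by itself, which reflects the fact that a sequence is a function on the whole of P; `islice` selects finitely many terms.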


If n ∈ Z⁺ and ā ∈ A, let f^n(ā) = a_n. Then f^n is a mapping of A into
itself. We can therefore express the recursion theorem in the following way.


Theorem 1.8.2 Suppose that A is a non-empty set and that f is a
mapping of A to itself. For each n ∈ Z⁺ there exists a unique mapping
f^n : A → A such that f^0(a) = a for all a ∈ A and such that
f^{s(n)}(a) = f(f^n(a)) for all n ∈ Z⁺ and a ∈ A.


The recursion principle can be extended to more complicated situations.
See Exercise 1.8.1 below.

We can now show that any two models of Z⁺ have exactly the same
properties.


Theorem 1.8.3 Suppose that (P, s) and (P′, s′) satisfy the Peano axioms,
with distinguished elements 0 and 0′ respectively. Then there is a unique
bijection t : P → P′ with t(0) = 0′ and s′t(n) = ts(n) for each n ∈ P. Thus
we have the diagram:


0  --s-->  1  --s-->  2  --s-->  ...  --s-->  n   --s-->  s(n)     --s-->  ...
|          |          |                       |           |
t          t          t                       t           t
v          v          v                       v           v
0′ --s′--> 1′ --s′--> 2′ --s′--> ...  --s′--> t(n) --s′--> s′(t(n)) --s′--> ...


Proof Set t(0) = 0′, and apply recursion to the mapping s′. There is then
a unique mapping t : P → P′ such that ts(n) = s′(t(n)) for each n ∈ P.
Similarly, there is a unique mapping t′ : P′ → P such that t′(0′) = 0 and
t′s′(n′) = st′(n′). We shall show by induction that t′t is the identity on P
and that tt′ is the identity on P′, so that t is a bijection. Let U = {n ∈ P :
t′t(n) = n}. Since t′t(0) = t′(0′) = 0, 0 ∈ U. Suppose that n ∈ U. Then


s(n) = st′t(n) = t′s′t(n) = t′ts(n),


so that s(n) ∈ U. Thus U = P. Exchanging the roles of P and P′, we also
see that tt′ is the identity on P′.
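
As an informal illustration of the theorem, we can compare two familiar models on an initial segment: Python's non-negative integers with s(n) = n + 1, and the sets built from the empty set by a → a ∪ {a}. The sketch below is an assumption-laden illustration, not a proof: the names are chosen here, and the inverse mapping is computed by the shortcut of counting members, which works only for this particular pair of models.

```python
def successor_set(a):
    """The successor a+ = a U {a} in the set-theoretic model."""
    return a | frozenset({a})


def t(n):
    """Defined by recursion: t(0) is the empty set and t(n + 1) = t(n)+."""
    result = frozenset()
    for _ in range(n):
        result = successor_set(result)
    return result


def t_inverse(x):
    """Inverse of t on its image: the set standing for n has n members."""
    return len(x)


checked = range(8)
commutes = all(t(n + 1) == successor_set(t(n)) for n in checked)
round_trip = all(t_inverse(t(n)) == n for n in checked)
```

Both `commutes` and `round_trip` are `True` on the range checked, which is the content of the commutative diagram restricted to finitely many columns.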


From now on, we take the non-negative integers to be a set Z⁺ =
{0, 1, 2, ...}, together with a map s : Z⁺ → Z⁺, such that the pair (Z⁺, s)
satisfies the Peano axioms, and take the natural numbers N = {1, 2, 3, ...} to
be the set s(Z⁺). We could, for example, take (Z⁺, s) to be the minimal
successor set of Theorem 1.7.4, with the successor map a → a⁺.
Properties of Z⁺ and N will however be derived from the Peano axioms, and
not from any particular set-theoretical properties that the model might have.


Exercises 


1.8.1 Suppose that (A_n)_{n∈Z⁺} is a sequence of non-empty subsets of a set A,
and that for each n ∈ Z⁺, f_n is a mapping from A_n into A_{s(n)}. Show
that if ā ∈ A_0 then there exists a unique sequence (a_n)_{n∈Z⁺} in A such
that a_n ∈ A_n for n ∈ Z⁺, a_0 = ā and a_{s(n)} = f_n(a_n) for n ∈ Z⁺.
[Hint: Let


D = {x ∈ Z⁺ × A : x = (n, a) with a ∈ A_n},


and consider the mapping φ : D → D defined by φ(n, a) =
(s(n), f_n(a)).]

1.8.2 Suppose that A is a non-empty set and that S : P(A) → P(A) is an
increasing function.

(a) Use recursion to show that there are sequences (H_n)_{n∈Z⁺} and
(J_n)_{n∈Z⁺} in P(A) such that H_0 = ∅ and S(H_n) = H_{s(n)} for n ∈ Z⁺,
and J_0 = A and S(J_n) = J_{s(n)} for n ∈ Z⁺.

(b) Show that (H_n)_{n∈Z⁺} is an increasing sequence and that (J_n)_{n∈Z⁺}
is a decreasing sequence.

(c) Let H = ∪_{n=0}^∞ H_n and J = ∩_{n=0}^∞ J_n. Show that if G ∈ P(A) and
G = S(G) then H ⊆ G ⊆ J.

(d) Give examples where H ≠ S(H) and J ≠ S(J).

(e) Let S be the mapping defined in the proof of the Schröder–Bernstein
theorem. Show that S(H) = H and S(J) = J.


1.9 The axiom of choice 


We have seen that a sequence, that is, a mapping from Z* to a set B, can 
be considered as a way of labelling elements of B. We can extend this idea to 
other mappings; if f is a mapping from a set A to a set B, we can consider 
A as an index set, used to label those elements of B which are in the image 
f(A). In this case, we denote the function by (f_α)_{α∈A}, and call it a family
of elements of B, indexed by A. Once again, f need not be injective, and so
there may be distinct α and α′ for which f_α = f_α′.

Suppose now that (B_α)_{α∈A} is a family of non-empty sets. The Cartesian
product ∏_{α∈A}(B_α) is then defined to be the set of all families (c_α)_{α∈A} with
values in ∪_{α∈A} B_α, such that c_α ∈ B_α for each α ∈ A. If β ∈ A, then the
mapping π_β defined by π_β(c) = c_β for c = (c_α)_{α∈A} in ∏_{α∈A}(B_α) is the
coordinate projection of ∏_{α∈A}(B_α) into B_β: π_β(c) = c_β is the β-th
coordinate of c.

The question then arises: does ∏_{α∈A}(B_α) have any members? At first glance, it
appears that it must; since each B_α is non-empty, there exists c_α in B_α, and
we can take (c_α)_{α∈A} as an element of ∏_{α∈A}(B_α). The problem is that we must
do this simultaneously, for all α ∈ A. We require a further axiom to say that
this is valid.


Axiom 10: The axiom of choice
This states that if (B_α)_{α∈A} is a family of non-empty sets then ∏_{α∈A}(B_α) is
non-empty: there exists a function c, a choice function, from A to ∪_{α∈A} B_α,
such that c_α = c(α) ∈ B_α for each α ∈ A.

The axiom of choice has a particular position in axiomatic set theory, which 
we shall discuss further in the next section. On the one hand, the way that 
we have presented it makes it seem plausible. On the other hand, there is 
no procedure for producing a choice function, so that its use is highly non- 
constructive. Further, the axiom of choice leads to some conclusions that 
seem bizarre. A famous example is the Banach—Tarski paradox, which says 
that a solid ball B in three dimensions can be divided into a finite number of 
disjoint sets, which can be rearranged, by rotation and translation, into two 
disjoint copies of B. 

Even when (B_n)_{n∈Z⁺} is a sequence of non-empty sets, we require a
version of the axiom of choice to ensure that there is a sequence (c_n)_{n∈Z⁺}
in ∪_{n∈Z⁺}(B_n) with c_n ∈ B_n for all n ∈ Z⁺. Restricting the axiom of
choice to sequences, we obtain the countable axiom of choice; this
certainly seems plausible, and we shall accept it, and use it, generally without
comment.

Although recursion enables us to construct sequences, it requires the use
of a given function f. Let us consider a more general situation. Suppose
that A is a non-empty set, and that φ is a mapping from A into the set
P(A) \ {∅} of non-empty subsets of A. Suppose that ā ∈ A. Does there exist
a sequence (a_n)_{n∈P} such that a_0 = ā and a_{s(n)} ∈ φ(a_n), for n ∈ P? At
stage n, we choose a_{s(n)} from the set φ(a_n). The axiom of dependent choice
states that this is always possible. It is an easy consequence of the axiom
of choice, and implies the countable axiom of choice, but is not equivalent
to either of them. Again, we shall accept it, and use it, generally without
comment.

In the general situation, though, we will state explicitly when we use the 
axiom of choice or use Zorn’s lemma. Zorn’s lemma is an axiom equivalent 
to the axiom of choice, and is particularly useful in analysis. 

Zorn’s lemma concerns partially ordered sets, and we need to make a
further definition in order to formulate it. Suppose that (A, ≤) is a partially
ordered set. A subset C is a chain if it is totally ordered under the order
inherited from the partial order on A; that is, if c and c′ are elements of C
then either c ≤ c′ or c′ ≤ c.

Zorn’s lemma then states that if (A, ≤) is a partially ordered set in which
each chain has an upper bound, then A has a maximal element.

Zorn’s lemma implies the axiom of choice, and the axiom of choice implies
Zorn’s lemma. We shall prove the former statement here. The proof of the
converse is long and technical; we give the details in Appendix A.
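
For a finite partially ordered set, chains and maximal elements can be found by inspection, and no axiom is needed. The Python sketch below merely illustrates the two definitions. The names, and the example of divisibility on a few integers, are assumptions made here.

```python
from itertools import combinations


def is_chain(C, leq):
    """True if any two elements of C are comparable under the order leq."""
    return all(leq(c, d) or leq(d, c) for c, d in combinations(C, 2))


def maximal_elements(A, leq):
    """Elements m of A such that leq(m, a) holds only for a = m."""
    return [m for m in A if all(a == m or not leq(m, a) for a in A)]


def divides(a, b):
    """The partial order used in this example: a <= b means a divides b."""
    return b % a == 0


A = [1, 2, 3, 4, 6, 8, 12]
```

Here `is_chain([1, 2, 4, 8], divides)` holds, while `[2, 3]` is not a chain, and `maximal_elements(A, divides)` returns `[8, 12]`. Note that a maximal element need not be a greatest element: neither 8 nor 12 is divisible by every member of A.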


Theorem 1.9.1 The axiom of choice is a consequence of Zorn’s lemma. 


Proof Suppose that (B_α)_{α∈A} is a non-empty family of non-empty sets. We
consider the set E of all pairs (Δ, c), where Δ is a subset of A, and c : Δ →
∪_{δ∈Δ} B_δ is a choice function. E is certainly not empty: if Δ = {δ}, there
exists b ∈ B_δ, and we define c(δ) = b. We give E a partial order by setting
(Δ, c) ≤ (Δ′, c′) if Δ ⊆ Δ′ and c′(δ) = c(δ) for all δ ∈ Δ. (This way of
ordering a set of ordered pairs (X, f), where X is a subset of a set Z and f
is a mapping from X to a set Y, is typical of the way that Zorn’s lemma is
used.) Suppose that C is a chain in E. Let


Δ_C = {δ ∈ A : δ ∈ Δ for some (Δ, c) in C}.


If δ ∈ Δ_C then δ ∈ Δ for some (Δ, c) in C. Let c_C(δ) = c(δ). Since C is a
chain, if δ ∈ Δ′ for some other (Δ′, c′) in C then c(δ) = c′(δ), so that c_C is
well defined (it does not matter which pair we choose). Further, (Δ_C, c_C) is
an upper bound for C.

We now apply Zorn’s lemma to deduce that there is a maximal element
(Δ_m, c_m) in E. We claim that Δ_m = A, so that c_m is a choice function on
A. Suppose not. Then there exists α ∈ A \ Δ_m and there exists b_α ∈ B_α.
Let Δ′ = Δ_m ∪ {α}, let c′(δ) = c_m(δ) for δ ∈ Δ_m, and let c′(α) = b_α. Then
(Δ′, c′) ∈ E, (Δ_m, c_m) ≤ (Δ′, c′) and (Δ_m, c_m) ≠ (Δ′, c′), contradicting the
maximality of (Δ_m, c_m).


Let us give another application of Zorn’s lemma, to obtain a result which
complements the Schröder–Bernstein theorem.


Theorem 1.9.2 Suppose that A and B are non-empty sets. Then either
there exists an injective mapping j : A → B or there exists an injective
mapping k : B → A.


Proof Let E be the set of ordered pairs (H, h), where H is a subset of A, and
h is an injective mapping of H into B. We order E by setting (H, h) ≤ (H′, h′)
if H ⊆ H′ and h′(a) = h(a) for a ∈ H. Suppose that C is a chain in E. As
above, we set


H_C = {a ∈ A : a ∈ H for some (H, h) in C}.


If a ∈ H_C then a ∈ H for some (H, h) in C. Let h_C(a) = h(a). Arguing as
above, h_C is well-defined. We must check that it is injective. If a and a′ are
distinct elements of H_C, then a ∈ H for some (H, h) ∈ C and a′ ∈ H′ for
some (H′, h′) ∈ C. Since C is a chain, either H ⊆ H′ or H′ ⊆ H. Suppose
that H ⊆ H′. Then a ∈ H′, and so h_C(a) = h′(a) ≠ h′(a′) = h_C(a′). A
similar argument holds if H′ ⊆ H. Thus (H_C, h_C) is an upper bound for C
in E.

We now apply Zorn’s lemma to deduce that there is a maximal element
(H_m, h_m) of E. If H_m = A, we are finished. Suppose that H_m ≠ A. Then
we claim that h_m(H_m) = B. For if not, there exist ā ∈ A \ H_m and b ∈
B \ h_m(H_m). Let H = H_m ∪ {ā} and define h : H → B by setting h(a) =
h_m(a) for a ∈ H_m and h(ā) = b. Then (H, h) ∈ E, (H_m, h_m) ≤ (H, h) and
(H_m, h_m) ≠ (H, h), contradicting the maximality of (H_m, h_m). Thus h_m is a
bijective mapping of H_m onto B, and we can take k to be the inverse mapping
h_m⁻¹.


Exercises 


1.10.1 Show that the axiom of choice implies the axiom of dependent choice. 

1.10.2 Show that the axiom of dependent choice implies the countable axiom 
of choice. 

1.10.3 Suppose that (φ_n)_{n=0}^∞ is a sequence of non-empty subsets of Z⁺. Use
recursion to show that there exists a sequence (f_n)_{n=0}^∞ in Z⁺ such that
f_0 ∈ φ_0 and f_{n+1} ∈ φ(f_n), for n ∈ Z⁺. Why is the axiom of dependent
choice not needed?

1.10.4 Suppose that (A, ≤_A) and (B, ≤_B) are partially ordered sets. Define
a relation ≤_l on A × B by setting (a, b) ≤_l (a′, b′) if either a <_A a′ or
a = a′ and b ≤_B b′.

(a) Show that ≤_l is a partial order on A × B (the lexicographic order).
(b) Show that if ≤_A and ≤_B are total orders then so is ≤_l.

1.10.5 Prove the following variant of Zorn’s lemma. Suppose that (A, ≤)
satisfies the conditions of Zorn’s lemma, and that C is a chain in A.
Show that there is a maximal element m of A such that c ≤ m for all
c ∈ C.


1.10 Concluding remarks 


We have now described the set-theoretical foundations on which we shall 
build mathematical analysis. In the process, we have constructed a model for 
the non-negative integers, which satisfies the requirements of Peano’s axioms.
How sound are these foundations? Are they consistent, or is it possible that
they may lead to a contradiction? Are they adequate, or are there problems 
which we are unable to solve using them? 

In order to discuss these questions, it is helpful to put them in a histori- 
cal context. The idea of developing mathematics from a collection of axioms 
goes back to Euclid’s Elements of the third century BC. Euclid gave five 
postulates, or axioms, from which he deduced geometric theorems. The fifth 
postulate, the parallel postulate, states, in the essentially equivalent form of 
Playfair’s axiom, that given a straight line in the plane, and a point not on 
it, there exists a unique straight line in the plane which passes through the 
given point, and which does not meet the given line. This postulate raised 
particular interest, since it was felt that it should be possible to deduce it 
from the other postulates, and many unsuccessful attempts were made to 
do so. In the early part of the nineteenth century, Gauss, János Bolyai and
Lobachevsky all developed the theory of non-Euclidean geometry (where the 
parallel postulate fails), but it was not until 1868 that Beltrami produced a 
model of a two-dimensional non-Euclidean geometry in the setting of three- 
dimensional Euclidean space, showing that the parallel postulate cannot be 
deduced from the other postulates. All this raised interest in the axioms, and 
in particular interest in their consistency. Hilbert studied this in detail and 
observed that if the postulates of Euclidean geometry are not consistent, then 
neither are Peano’s axioms. He came to believe that there should be a consis- 
tent set of axioms for mathematics, from which all results could be deduced. 
In his famous address to the International Congress of Mathematicians in 
Paris in 1900, in which he set out his twenty-three important problems for 
the twentieth century, he talked of 


the conviction (which every mathematician shares, but which no- 
one has yet supported by a proof) that every definite mathematical 
problem must necessarily be susceptible of an exact settlement, 
either in the form of an actual answer to the problem posed or 
by the proof of the impossibility of solution and therewith the 
necessary failure of all attempts. 


Here he clearly had in mind the necessary failure to prove the parallel 
postulate from the other postulates. Later on, he said 

This conviction of the solvability of every mathematical problem is 
a powerful incentive to the worker. We hear within us the perpetual 
call: There is the problem. Seek its solution. You can find it by pure 
reason, for in mathematics there is no ignorabimus. 


This optimism was overturned by Gödel in a spectacular way in 1930 and 
1931. First came his incompleteness theorem, which showed that within any 
logical theory (satisfying certain technical conditions, which reasonably can 
be expected to hold in a worthwhile theory) there are statements which 
cannot be proved, and whose negation cannot be proved. Not every mathematical 
problem is susceptible of an exact solution. Alas, ignorabimus! Next came his 
inconsistency theorem: if a proof of consistency can be given within the 
theory, then necessarily a proof of inconsistency can also be given. For a 
system to be consistent, it must be impossible to prove its consistency 
within the system itself. 

Where does this leave ZF? First, and we shall illustrate this in a moment, 
the axioms of ZF cannot be the axioms for all of mathematics, nor can we 
add to them to obtain a set of axioms for all mathematics. Secondly, they 
cannot be proved to be consistent. Nevertheless they have stood the test of 
time, and so provide us with a valuable starting point. It is interesting to 
speculate what would happen if an inconsistency were found. Mathematics 
would not collapse: mathematicians would continue their work, turning to 
their logician colleagues to produce a better set of axioms. The effect on the 
mathematical analysis that we shall be considering would be negligible. 

What about the axiom of choice? In 1938, Gödel showed that if ZF is 
consistent, then so is the system obtained by adding the axiom of choice. 
In 1963, Cohen showed that there are models of ZF in which the axiom of 
choice does not hold. Thus the axiom of choice is independent of the axioms 
of ZF, and cannot be proved or disproved, starting from ZF. We can add 
the axiom of choice to obtain a stronger axiom system ZFC. Within this, 
there are further statements that cannot be proved or disproved, such as the 
continuum hypothesis (which states that if A is an uncountable subset of the 
set R of real numbers, then there exists a bijection f : A → R). In fact, we 
shall adopt the axiom of choice, but will use it as sparingly as possible. 


2 


Number systems 


2.1 The non-negative integers and the natural numbers 


In this chapter we study various number systems. We begin by developing 
the familiar properties of the non-negative integers Z⁺ = {0, 1, 2, ...} and 
the natural numbers N = {1, 2, 3, ...}, using the Peano axioms, induction 
and recursion. 

We begin with addition. This is defined by repeatedly adding 1; we use 
recursion to formalize this. Suppose that m ∈ Z⁺. Considering the mapping 
s : Z⁺ → Z⁺, setting m_0 = m, and using recursion, we see that there is a 
sequence (m_n)_{n∈Z⁺} such that m_0 = m and m_{s(n)} = s(m_n). We call m_n the 
sum of m and n, and denote it by m + n. Thus m + 0 = m and m + 1 = s(m). 
The equation m_{s(n)} = s(m_n) becomes 


m + (n + 1) = (m + n) + 1. (*) 
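
The recursive definition of addition can be sketched in Python. This is only an illustration, under the assumption that Python's built-in non-negative integers stand in for Z⁺; the names `succ` and `add` are ours, and `succ` simply borrows Python's own `+ 1` to model the successor mapping s.

```python
def succ(n: int) -> int:
    """The successor mapping s on Z+ (modelled by Python's non-negative ints)."""
    return n + 1


def add(m: int, n: int) -> int:
    """Return m + n by applying s to m exactly n times.

    This mirrors the recursion m_0 = m, m_{s(n)} = s(m_n).
    """
    result = m          # m_0 = m
    for _ in range(n):  # after j passes, result equals m_j
        result = succ(result)
    return result


# Equation (*): m + (n + 1) = (m + n) + 1, checked on a small range.
assert all(add(m, succ(n)) == succ(add(m, n)) for m in range(20) for n in range(20))
# Commutativity, also checked only on a small range.
assert all(add(m, n) == add(n, m) for m in range(20) for n in range(20))
```

Such finite checks are of course no substitute for a proof by induction; they merely show the recursion at work.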
Here are the fundamental results about addition. 


Theorem 2.1.1 Suppose that m, n, p ∈ Z⁺. 


(i) m + n = n + m (commutativity); 
(ii) (m + n) + p = m + (n + p) (associativity); 
(iii) if m + n = p + n then m = p (cancellation); 
(iv) if m + n = 0 then m = n = 0. 

Proof The proof uses induction many times over. 


(i) We prove this in three steps. First we show that m + 0 = 0 + m for all 
m. We use induction. Let U = {m ∈ Z⁺ : 0 + m = m + 0}. Then 0 ∈ U, 
since 0 + 0 = 0 + 0. Suppose that m ∈ U. Then (m + 1) + 0 = m + 1, 
and 0 + (m + 1) = (0 + m) + 1, by (*), and (0 + m) + 1 = m + 1. Thus 
m + 1 ∈ U, and so U = Z⁺, by induction. 

Next, we show that 

(m + 1) + n = (m + n) + 1 for all m, n ∈ Z⁺. (†) 

Again, we use induction. Let 


V = {n ∈ Z⁺ : (m + 1) + n = (m + n) + 1 for all m ∈ Z⁺}. 


Since (m + 1) + 0 = m + 1 = (m + 0) + 1, 0 ∈ V. Suppose that n ∈ V. 
Then 

(m + 1) + (n + 1) = ((m + 1) + n) + 1, by (*), 
                  = ((m + n) + 1) + 1, since n ∈ V, 
                  = (m + (n + 1)) + 1, by (*) again. 


Thus n + 1 ∈ V, and V = Z⁺, by induction. 

Finally we establish (i), using induction once more. Let 


W = {n ∈ Z⁺ : m + n = n + m for all m ∈ Z⁺}. 

Then 0 ∈ W, by the first step. Suppose that n ∈ W. Then 

m + (n + 1) = (m + n) + 1, by (*), 
            = (n + m) + 1, since n ∈ W, 
            = (n + 1) + m, by (†). 


Thus n + 1 ∈ W, and W = Z⁺, by induction. 

(ii) Induction once more. Let 


X = {p ∈ Z⁺ : (m + n) + p = m + (n + p) for all m, n ∈ Z⁺}. 


Since (m + n) + 0 = m + n = m + (n + 0), 0 ∈ X. Suppose that p ∈ X. 
Then 

(m + n) + (p + 1) = ((m + n) + p) + 1, by (*), 
                  = (m + (n + p)) + 1, since p ∈ X, 
                  = m + ((n + p) + 1), by (*), 
                  = m + (n + (p + 1)), by (*) again. 


Thus p + 1 ∈ X, and X = Z⁺, by induction. 

(iii) A final use of induction. Let 


Y = {n ∈ Z⁺ : if m + n = p + n then m = p, for all m, p ∈ Z⁺}. 

Since m + 0 = m and p + 0 = p, 0 ∈ Y. Suppose that n ∈ Y and that 
m + (n + 1) = p + (n + 1). Then 


(m + n) + 1 = m + (n + 1) = p + (n + 1) = (p + n) + 1, by (*), 


and so m + n = p + n, by the Peano axiom (P4). Since n ∈ Y, it follows 
that m = p, and so n + 1 ∈ Y. Thus Y = Z⁺, by induction. 

(iv) Suppose that m + n = 0 and that n ≠ 0. Then n ∈ N = s(Z⁺), and 
so n = p + 1 for some p ∈ Z⁺. Then 0 = m + (p + 1) = (m + p) + 1, 
by (ii). This contradicts the Peano axiom (P3). Thus n = 0, and so 
m = m + 0 = m + n = 0. 

As a result of (ii), we can write (m+n)+p=m+(n+p)=m+n+p, 
omitting the brackets. 

By now, proof by induction should be familiar! In future, the details of 
many such proofs will be left to the reader. 

We now define multiplication recursively. Suppose that n ∈ Z⁺. Using 
recursion, we see that there exists a sequence (p_m)_{m∈Z⁺} such that p_0 = 0 and 
p_{m+1} = p_m + n. We then set p_m = m.n (or mn, if this causes no confusion). 
The number mn is the product of m and n. Arguing as in Theorem 2.1.1, we 
obtain the following. 


Theorem 2.1.2 Suppose that m, n, p ∈ Z⁺. 


(i) m.n = n.m (commutativity); 
(ii) 0.n = 0 and 1.n = n; 
(iii) (m.n).p = m.(n.p) (associativity); 
(iv) if m.n = p.n and n ≠ 0 then m = p (cancellation); 
(v) if m.n = 0 then m = 0 or n = 0. 


Proof The proofs, by induction, are left as exercises for the reader. 


Again, we can write (mn)p = m(np) = mnp, omitting the brackets. We 
also connect addition and multiplication. 


Theorem 2.1.3 Suppose that m, n, p ∈ Z⁺. Then m.(n + p) = 
(m.n) + (m.p) (the distributive law). 


Proof The proof, by induction, is again left as an exercise for the reader. 
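
The recursion defining the product, and the laws just stated, can be tried out numerically. The following is a minimal sketch, assuming that Python's non-negative integers model Z⁺ and using Python's `+` for the addition already constructed; the name `mul` is ours.

```python
def mul(m: int, n: int) -> int:
    """Return m.n via the recursion p_0 = 0, p_{k+1} = p_k + n."""
    p = 0               # p_0 = 0
    for _ in range(m):  # after k passes, p equals p_k
        p = p + n
    return p


R = range(8)
# Commutativity and the distributive law, checked on a small range only.
assert all(mul(m, n) == mul(n, m) for m in R for n in R)
assert all(mul(m, n + p) == mul(m, n) + mul(m, p) for m in R for n in R for p in R)
```

The checks cover only finitely many cases; the general statements still require induction.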


We write (m.n) + (m.p) = mn + mp: multiplication is done before addition. 

Corollary 2.1.4 Suppose that m, n ∈ Z⁺. 
(i) If m.n = n then n = 0 or m = 1. 
(ii) If m.n = 1 then m = n = 1. 


Proof We use results from Theorems 2.1.1 and 2.1.2, without comment. 
Decide which results are used at each stage of the arguments. 

(i) If n ≠ 0 then m ≠ 0, and so there exists k ∈ Z⁺ such that m = k + 1. 
Then 0 + n = n = mn = (k + 1)n = kn + n, so that kn = 0, by cancellation. 
Since n ≠ 0, k = 0 and m = 1. 

(ii) m ≠ 0 and n ≠ 0, so that there exist k, l ∈ Z⁺ such that m = k + 1 
and n = l + 1. Then 1 = mn = (k + 1)(l + 1) = kl + k + l + 1, so that 
kl + (k + l) = 0. Thus k + l = 0 and k = l = 0. Thus m = n = 1. 


We now use addition to define an order relation on Z⁺. If m, n ∈ Z⁺ we 
set m ≤ n if there exists t ∈ Z⁺ such that n = m + t. Note that 0 ≤ n for all 
n ∈ Z⁺, since n = 0 + n. We set m < n if m ≤ n and m ≠ n. Thus m < n if 
and only if there exists u ∈ N such that n = m + u. 


Theorem 2.1.5 Z⁺ is well-ordered by the relation ≤. That is: 
(i) if m ≤ n and n ≤ p then m ≤ p; 
(ii) if m, n ∈ Z⁺ then either m ≤ n or n ≤ m; 
(iii) if m ≤ n and n ≤ m then m = n; 
(iv) if A is a non-empty subset of Z⁺ then there exists a ∈ A such that 
a ≤ a' for all a' ∈ A (a is the least element of A, and so is the infimum 
of A; we denote it by inf A). 


Proof 


(i) If m ≤ n and n ≤ p then there exist t, u in Z⁺ such that n = m + t and 
p = n + u. Then p = (m + t) + u = m + (t + u), so that m ≤ p. 

(ii) We use induction. Suppose that n ∈ Z⁺. Let 


U_n = {m ∈ Z⁺ : m ≤ n or n ≤ m}. 


Then 0 ∈ U_n. Suppose that m ∈ U_n. We consider two cases. First, 
suppose that m < n. Then n = m + u for some u ∈ N. Thus u = r + 1 
for some r ∈ Z⁺, and so n = m + (r + 1) = (m + 1) + r; m + 1 ≤ n and 
m + 1 ∈ U_n. Secondly, suppose that n ≤ m. Then m = n + t, for some 
t ∈ Z⁺, and so m + 1 = n + t + 1, and m + 1 ∈ U_n. It therefore follows 
by induction that U_n = Z⁺. 

(iii) If m ≤ n and n ≤ m then there exist t, u ∈ Z⁺ such that n = m + t and 
m = n + u. Thus n + 0 = n = n + (t + u), so that t + u = 0. By Theorem 
2.1.1 (iv), it follows that t = u = 0, so that m = n. 

(iv) Another proof by induction. Suppose that A does not have a least 
element. Let 

V = {m ∈ Z⁺ : m ≤ a for all a ∈ A}. 


Note that A ∩ V = ∅, since an element of A ∩ V would be a least element 
of A. Also 0 ∈ V, since 0 ≤ n for all n ∈ Z⁺. Suppose that 
m ∈ V and that a ∈ A. Since m ∉ A, m < a. Thus a = m + t, where 
t ∈ N. Thus t = r + 1 for some r ∈ Z⁺, so that 


a = m + (r + 1) = (m + 1) + r, 


and m + 1 ≤ a. Since this holds for all a ∈ A, m + 1 ∈ V. By induction, 
V = Z⁺. Since A ∩ V = ∅, it follows that A is empty, giving a 
contradiction. 
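
The order relation and the search for a least element can be illustrated as follows. This is a sketch only, assuming that Python integers model Z⁺ and that the subset A is given as a finite Python set; the names `leq` and `least_element` are ours.

```python
def leq(m: int, n: int) -> bool:
    """m <= n in the sense of the text: n = m + t for some t in Z+.

    Any such t satisfies t <= n, so a bounded search suffices.
    """
    return any(m + t == n for t in range(n + 1))


def least_element(a) -> int:
    """Least element of a non-empty subset a of Z+ (given as a Python set).

    Counts upwards from 0. Whenever the loop tests a candidate m, no smaller
    number lies in a, so m <= x for every x in a, as for the set V of the proof.
    """
    if not a:
        raise ValueError("the empty set has no least element")
    m = 0
    while m not in a:
        m += 1
    return m


A = {12, 7, 30, 9}
least = least_element(A)
assert least == 7
assert all(leq(least, x) for x in A)
```

The loop terminates precisely because A is non-empty, which is the computational counterpart of the contradiction reached in the proof of (iv).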


The well-ordering property provides an alternative approach to induction. 
Suppose that Q(x) is a well-formed formula, that T = {n ∈ Z⁺ : 
Q(n) is true} and that F = {n ∈ Z⁺ : Q(n) is false}. Suppose that we know 
that 0 ∈ T, and can show that if Q(n) holds then Q(n + 1) holds. Then F = ∅. 
For if not, F has a least element f. Then f ≠ 0, and so f = n + 1 for some 
n ∈ Z⁺. But then n < f, so that n ∉ F. Thus n ∈ T, and so f ∈ T, giving a 
contradiction. 

If m ≤ n and n = m + t then we write m = n − t: we shall remove the 
restriction m ≤ n in Section 2.4. Similarly, if n = mk, with k ≠ 0, we write 
m = n/k and say that m divides n. We shall consider division further in 
Sections 2.5 and 2.6. 


Exercises 


2.1.1 (The complete induction principle) Suppose that Q(x) is a well-formed 
formula, that Q(0) holds, and that we can show that if Q(m) holds for 
all m ≤ n then Q(n + 1) holds. Show that Q(n) holds for all n ∈ Z⁺ 
(a) by induction, and 
(b) by using the well-ordering of Z⁺. 

2.1.2 Define m^n recursively by m^0 = 1 (note that 0^0 = 1) and m^{n+1} = m^n.m. 
Show that (m^n)^p = m^{np} and that (mn)^p = m^p n^p. 

2.1.3 Show that n < 2^n for all n ∈ Z⁺. A number n ∈ N is even if 2 divides 
n, and odd if not. Show that if n ∈ N then there exist k ∈ Z⁺ and 
j ∈ N such that j is odd and n = 2^k j. Show that k and j are uniquely 
determined by n. 

2.1.4 Show how to define n! so that 0! = 1 and (n + 1)! = (n!)(n + 1). 

2.1.5 The Fibonacci sequence (F_n)_{n∈Z⁺} is defined by F_0 = 0, F_1 = 1, F_{n+2} = 
F_n + F_{n+1} for n ≥ 0. 
(a) Explain how this definition can be justified by recursion. The 
numbers that occur in the sequence are called Fibonacci numbers. 
(b) Show by induction that 2 divides F_k if and only if 3 divides k, and 
that 3 divides F_k if and only if 4 divides k. When does 5 divide F_k? 
(c) Show that F_{n+k+1} = F_n F_k + F_{n+1} F_{k+1}. 

2.1.6 Show that 5 divides 2^{2n+2} + 3^{2n} for all n ∈ Z⁺. 

2.1.7 Suppose that (A_n)_{n∈Z⁺} is a sequence of non-empty totally ordered 
sets and that A = ∏_{n∈Z⁺} A_n. If x, y ∈ A and x ≠ y, let k(x, y) = 
inf{n ∈ Z⁺ : x_n ≠ y_n}. If x, y ∈ A, set x ≤ y if x = y or 
x_{k(x,y)} < y_{k(x,y)}. Show that this is a total order on A (the lexicographic 
order on A). 


2.2 Finite and infinite sets 


We are all familiar with the basic properties of finite sets. Nevertheless, we 
need to deduce these properties from Peano’s axioms. Since we shall be con- 
cerned with counting, we shall work with the natural numbers N, rather than 
with Z⁺. 

An initial segment I of N is a non-empty subset of N with the property 
that if n ∈ I and m ≤ n then m ∈ I. 


Proposition 2.2.1 If I is an initial segment of N then either I = N or 
there exists n ∈ N such that I = I_n = {m ∈ N : m ≤ n}. 


Proof It follows immediately from the definition of an initial segment that 
if m ∉ I and n > m then n ∉ I. If I ≠ N, then N \ I is non-empty; let m_0 
be its least element. Suppose, if possible, that m_0 = 1. If n ∈ N, then n ≥ 1, 
so that n ∉ I and I = ∅, contradicting the fact that I is non-empty. Thus 
m_0 > 1, and so there exists n ∈ N such that m_0 = n + 1. Then n ∈ I, and 
so I_n ⊆ I. But if p > n then p ≥ n + 1 = m_0, and so p ∉ I. Thus I ⊆ I_n. 


So far, we have defined a sequence to be a mapping from Z⁺ to a set A. 
We now extend the definition, to include mappings from N to A. A mapping 
f from an initial segment I to a set A is also called a sequence. If I = I_n, it 
is called a finite sequence in A of length n, or an n-tuple, and is denoted by 
(f_j)_{j=1}^{n} or (f_1, ..., f_n). 

We say that a set A is finite if either A is empty or there exists n ∈ N and 
a bijective mapping c : I_n → A. Thus the finite sequence (c_1, ..., c_n) lists the 
elements of A, without repetition. A set is infinite if it is not finite. 


Proposition 2.2.2 If f : I_m → I_n is an injective mapping then m ≤ n. 

Proof The proof is by induction on m. The result is trivially true if m = 1. 
Suppose that it holds for m, and that f : I_{m+1} → I_n is injective. Then 
m + 1 ≥ 2, so that f(I_{m+1}) contains at least two points, and so n = k + 1, 
for some k ∈ N. Let τ : I_n → I_n be the mapping that transposes f(m + 1) 
and n and leaves the other elements of I_n fixed. Then τ ∘ f : I_{m+1} → I_n is 
injective, and τ(f(I_m)) ⊆ I_k. By the inductive hypothesis, m ≤ k, and so 
m + 1 ≤ k + 1 = n. 


Corollary 2.2.3 If A is a non-empty finite set, there exists a unique 
n ∈ N for which there exists a bijection c : I_n → A. 


Proof Suppose that c : I_n → A and c' : I_{n'} → A are bijections. Then 
c⁻¹ ∘ c' : I_{n'} → I_n is a bijection, and so n' ≤ n. Similarly, n ≤ n'. 


The number n is the size or cardinality of A; it is written as |A|, or as 
#(A). We assign the empty set size 0. 


Proposition 2.2.4 Suppose that A is a finite set, and that f : A → B is 
a bijection. Then B is finite, and |B| = |A|. 


Proof For if c : I_{|A|} → A is a bijection, then the mapping f ∘ c : I_{|A|} → B 
is a bijection. 


Proposition 2.2.5 If A is a non-empty subset of I_n then A has a greatest 
element. 


Proof Let U = {m ∈ N : a ≤ m for all a ∈ A} be the set of upper bounds of 
A. Then n ∈ U, so that U ≠ ∅. Let b be the least element of U. If b = 1 then 
A = {b}, so that b ∈ A. Suppose that b ∉ A. Then b ≠ 1, and so b = c + 1 for 
some c ∈ N. But then c ∈ U, contradicting the minimality of b. Thus b ∈ A, 
and b is the greatest element of A. 


Corollary 2.2.6 If A is a non-empty subset of N with greatest element 
n, then A is finite, and |A| ≤ n, with equality if and only if A = I_n. 


Proof We prove this by complete induction on n. The result is certainly true 
if n = 1, since then A = {1} and |A| = 1. Suppose that it is true for all j ≤ n, 
and that A is a subset of N with greatest element n + 1. If A is the singleton 
{n + 1} then the result certainly holds. Otherwise, let A' = A \ {n + 1}. 
Then A' ≠ ∅, and so A' has a greatest element n' with n' ≤ n. By the 
inductive hypothesis, A' is finite, and k = |A'| ≤ n', with equality only if 
A' = I_{n'}. Let c' : I_k → A' be a bijection. If m ∈ I_{k+1}, let c(m) = c'(m) if 
m ≤ k and let c(k + 1) = n + 1. Then c is a bijection of I_{k+1} onto A, so that 
|A| = k + 1 ≤ n' + 1 ≤ n + 1. Finally, k + 1 = n + 1 only if k = n' = n, in 
which case A' = I_n and A = I_{n+1}. 


Corollary 2.2.7 Suppose that B is a subset of a finite set A. Then B is 
finite, and |B| ≤ |A|, with equality if and only if B = A. 


Proof If B is empty, then B is finite. If B is not empty then A is not empty, 
and there exist n ∈ N and a bijection c : I_n → A. Then c⁻¹(B) is a 
non-empty finite subset of I_n, and so there exists m ∈ N, with m ≤ n, and a 
bijection d : I_m → c⁻¹(B). Then c ∘ d is a bijection of I_m onto B. Thus B 
is finite, and |B| = m ≤ n = |A|. Equality holds if and only if c⁻¹(B) = I_n, 
and this happens if and only if B = c(I_n) = A. 


Corollary 2.2.8 Suppose that A is a non-empty finite set and that f : 
A → A is an injective mapping. Then f is bijective. 


Proof Let c : I_{|A|} → A be a bijection. Then f ∘ c : I_{|A|} → f(A) is a bijection. 
Thus |f(A)| = |A|, and so f(A) = A. 
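
For a small finite set, the statement of Corollary 2.2.8 can be confirmed by exhaustive search. The sketch below is purely illustrative; it represents a mapping as a Python dictionary, and the function name `injective_self_maps` is ours.

```python
from itertools import product


def injective_self_maps(a):
    """Return every injective mapping f : a -> a, each as a dict."""
    elements = sorted(a)
    maps = []
    for images in product(elements, repeat=len(elements)):
        f = dict(zip(elements, images))
        if len(set(f.values())) == len(elements):  # no value is repeated
            maps.append(f)
    return maps


A = {1, 2, 3, 4}
maps = injective_self_maps(A)
# Every injective self-map found is also surjective, and hence bijective.
assert all(set(f.values()) == A for f in maps)
```

The search examines all |A|^|A| mappings, so it is practical only for very small sets.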


Dedekind defined a set A to be infinite if there is an injective map j : 
A → A which is not surjective; such sets are now called Dedekind infinite. 
For example, N is Dedekind infinite, since the mapping n → 2n : N → N is 
injective, and is not surjective. 


Corollary 2.2.9 A Dedekind infinite set is infinite. 

Corollary 2.2.10 N is infinite. 


There are many other basic properties of finite sets, including those listed 
in the exercises. Use only induction, recursion, Peano’s axioms and the results 
derived from them to establish them. 


Exercises 


2.2.1 Suppose that A is a finite set, and that f : A → B is a surjection. 
Show that B is finite, and that |B| ≤ |A|, with equality if and only if 
f is a bijection. 

2.2.2 Suppose that A is an infinite set and that f is a mapping from A into 
itself. Show that there exists a non-empty proper subset B of A such 
that f(B) ⊆ B. 
[Hint: consider the set 


{a ∈ A : there exists n ∈ N such that f^n(a) = a}.] 


Does the same hold for finite sets? 

2.2.3 Show that if A is finite and f is a mapping from A to B then f(A) is 
finite. 

2.2.4 Show that if A and B are finite subsets of a set X then A ∪ B is finite, 
and show that |A ∪ B| + |A ∩ B| = |A| + |B|. 

2.2.5 Suppose that A_1, ..., A_n are finite subsets of a set X. Use induction 
and the result of the previous exercise to prove the inclusion-exclusion 
principle: 


|A_1 ∪ ··· ∪ A_n| 
= Σ_{k=1}^{n} (−1)^{k+1} Σ {|A_{j_1} ∩ ··· ∩ A_{j_k}| : 1 ≤ j_1 < ··· < j_k ≤ n}. 


2.2.6 The pigeonhole principle. Suppose that f is a mapping from a set A 
to a finite set B. Show that if A is finite and |A| > |B| then f is not 
injective. Show that if A is infinite, then there exists b ∈ B such that 
f⁻¹({b}) is infinite. 

2.2.7 A tennis club has more than one member. During a season, each 
member plays against none, some or all of the other members. Show that 
there are two members who play against the same number of other 
members. 

2.2.8 Suppose that M and W are non-empty finite sets and that H is a 
relation on M × W. If m ∈ M, let h(m) = {w ∈ W : (m, w) ∈ H} 
and if A ⊆ M let h(A) = ∪_{m∈A} h(m). Show that the following are 
equivalent: 

(a) |h(A)| ≥ |A| for all A ⊆ M. 

(b) There exists an injective mapping χ : M → W such that 
(m, χ(m)) ∈ H, for all m ∈ M. 
[Hint: use induction on |M|. Consider two cases: 

(i) |h(A)| > |A| for every non-empty proper subset A of M; 

(ii) there exists a non-empty proper subset A of M for which |h(A)| = 
|A|.] 

This is Hall’s marriage theorem; M is a set of men, W is a set of 
women, and (m, w) ∈ H if m and w know and like each other. 

2.2.9 Suppose that (k_n)_{n∈Z⁺} is a decreasing sequence in Z⁺: if m ≥ n 
then k_m ≤ k_n. Show that (k_n)_{n∈Z⁺} is eventually constant: there exists 
N ∈ Z⁺ such that if m ≥ N then k_m = k_N. 

2.2.10 Suppose that (A, ≤) is a non-empty totally ordered set for which each 
non-empty subset has a least element and a greatest element. If a ∈ A, 
let U(a) = {b ∈ A : a < b} be the set of strict upper bounds of {a} 
in A. Let s(a) be the least element of U(a) if U(a) is non-empty, and 
let s(a) = a otherwise. Show by recursion that there is a surjective 
mapping f : Z⁺ → A such that f(m) ≤ f(n) if m ≤ n. Show that A 
is finite. 

2.2.11 Suppose that (a_1, ..., a_n) is a finite sequence in Z⁺. Show that there 
are sequences (s_1, ..., s_n) and (p_1, ..., p_n) such that s_1 = p_1 = a_1 and 
s_{j+1} = s_j + a_{j+1}, p_{j+1} = p_j.a_{j+1} for 1 ≤ j < n. We write 


s_n = a_1 + ··· + a_n or s_n = Σ_{j=1}^{n} a_j, p_n = a_1. ··· .a_n or p_n = ∏_{j=1}^{n} a_j. 


(This clearly will extend to other settings.) Suppose that σ is a 
permutation of I_n. Show that 


Σ_{j=1}^{n} a_{σ(j)} = Σ_{j=1}^{n} a_j and ∏_{j=1}^{n} a_{σ(j)} = ∏_{j=1}^{n} a_j. 


2.2.12 Show that 1³ + 2³ + ··· + r³ = (1 + 2 + ··· + r)², for all r ∈ N. 

2.2.13 Show that 1³ + 3³ + ··· + (2n − 1)³ = n²(2n² − 1) for all n ∈ Z⁺. 

2.2.14 Show that any n ∈ N can be written as the sum of a strictly 
decreasing sequence of Fibonacci numbers. Is this representation 
unique? 

2.2.15 Suppose that A is finite and that (B_α)_{α∈A} is a family of finite sets. 
Show that the Cartesian product ∏_{α∈A} B_α is finite and determine its 
size. 

2.2.16 Suppose that A and B are finite. Show that B^A is finite, and determine 
its size. 

2.2.17 Suppose that A is finite. Show that P(A) is finite, and determine its 
size. By considering mappings f : A → {0, 1}, relate this result to the 
previous one. 

2.2.18 Let Σ_A be the set of permutations of a non-empty set A. Show that if 
A is finite, then Σ_A is finite; determine its size. 

2.2.19 Suppose that A and B are finite. Let I be the set of injective mappings 
from A to B. 

(a) Determine the size of I. 

(b) Define an equivalence relation on I by setting f ~ g if f(A) = 
g(A). Determine the size of the equivalence classes. 

(c) Let (n choose k) denote the size of the set of subsets of I_n of size k. Show 
that if k ≤ n then 

(n choose k) = n! / (k! (n − k)!). 

(d) Prove de Moivre’s formula 


(n+1 choose k) = (n choose k) + (n choose k−1), 

and its generalization, Vandermonde’s formula, 

(m+n choose k) = Σ_j (m choose j) (n choose k−j). 


(e) By considering the largest member of a subset of I_{n+1} of size k + 1, 
show that 


(n+1 choose k+1) = Σ_{j=k}^{n} (j choose k). 


2.2.20 Suppose that k_1, ..., k_r ∈ Z⁺ and that k_1 + ··· + k_r = n. Show that 
there are 

n! / (k_1! k_2! ··· k_r!) 

r-tuples (A_1, ..., A_r) of pairwise disjoint subsets of I_n with |A_j| = k_j 
for 1 ≤ j ≤ r. 


2.2.21 Show that if A is a non-empty finite set then the number of subsets of 
A of even size is the same as the number of subsets of A of odd size. 

2.2.22 Suppose that n, k ∈ N. Show that n can be written as a_1 + ··· + a_k, 
with a_i ∈ Z⁺ for 1 ≤ i ≤ k, in (n+k−1 choose k−1) distinct ways. How many 
distinct ways are there of writing n as b_1 + ··· + b_k, with b_i ∈ N for 
1 ≤ i ≤ k? 


2.3 Countable sets 


A set A is countable if it is finite or if there is a bijection c : N → A; otherwise 
it is uncountable. Thus a set is countable if it is empty or if there is a bijection 
from an initial segment of N onto A. The function c is called an enumeration 
of A. A set is countably infinite if it is infinite and countable. 

Thus A is countably infinite if and only if the elements of A can be listed, 
or enumerated, as an infinite sequence (c_1, c_2, ...), without repetition. 

If A is countable (countably infinite) and j : A → B is a bijection, then B 
is countable (countably infinite). 

Not every set is countable, since it is an immediate consequence of Theorem 
1.6.3 that the set P(N) of subsets of N is not countable. It was Cantor who 
first showed, in 1873, that there are different sizes of infinite set, showing 
that the set of real numbers is uncountable. We shall prove this in Section 
3.6, where we shall also describe the consternation which Cantor’s result 
produced. Meanwhile, let us concentrate on countable sets. 


Theorem 2.3.1 If A is a subset of N without a greatest element then 
there exists a unique strictly increasing function f : N → N (that is, f(n) < 
f(n + 1) for all n ∈ N) such that f(N) = A. 


Proof We construct the function recursively. If n ∈ N then A_n = A \ I_n = 
{m ∈ A : m > n} is non-empty, by hypothesis. Let g(n) be the least element 
of A_n. Then g is a mapping from N to N, and g(n) > n for all n ∈ N. 
By recursion, there exists a mapping f : N → N such that f(1) = inf A 
and f(n + 1) = g(f(n)), for all n ∈ N. Since g(f(n)) > f(n), f is strictly 
increasing; further, f(N) ⊆ A. 

Next we show that f(N) = A. If not, let b be the least element of A \ f(N). 
Then f(1) < b, so that the set A ∩ {n ∈ N : n < b} is not empty. By 
Proposition 2.2.5, it has a greatest element c. Then g(c) = b. But c ∈ A and 
c < b, so that c ∈ f(N); if c = f(k), then b = f(k + 1), giving the required 
contradiction. 

It remains to show that f is unique. Suppose that h : N → N is a strictly 
increasing function such that h(N) = A, and that h ≠ f. Then there exists 
a least n such that h(n) ≠ f(n). Since f(1) = h(1) = inf A, the least 
element of A, n > 1. Suppose that h(n) > f(n). Then h(n − 1) = f(n − 1) < 
f(n) < h(n). But f(n) ∈ A, and so f(n) = h(m) for some m ∈ N. Since 
h is strictly increasing, n − 1 < m < n, giving a contradiction. A similar 
argument applies if h(n) < f(n). Hence f is unique. 


The mapping f is called the standard enumeration of A. 
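
The recursive construction of the standard enumeration can be sketched as a Python generator. This is an illustration under stated assumptions: A is described by a membership test `in_a` (a hypothetical parameter of ours), f(1) is taken to be the least element of A, and A is assumed to be non-empty with no greatest element, since otherwise the search in `g` would not terminate.

```python
from itertools import islice
from typing import Callable, Iterator


def standard_enumeration(in_a: Callable[[int], bool]) -> Iterator[int]:
    """Yield f(1), f(2), ... for A = {n in N : in_a(n)}.

    Assumes A is non-empty and has no greatest element.
    """
    def g(n: int) -> int:
        # Least element of A that is strictly greater than n.
        m = n + 1
        while not in_a(m):
            m += 1
        return m

    current = g(0)  # the least element of A, used as f(1)
    while True:
        yield current
        current = g(current)  # f(n + 1) = g(f(n))


# Example: A is the set of multiples of 3.
first_terms = list(islice(standard_enumeration(lambda n: n % 3 == 0), 5))
assert first_terms == [3, 6, 9, 12, 15]
```

Only finitely many terms can ever be inspected, which is why the generator is consumed through `islice`.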


Corollary 2.3.2 Suppose that A is a non-empty subset of N. If A has an 
upper bound in N, then A is finite; otherwise, A is countably infinite. 


Proof If A has an upper bound, then it is finite, by Proposition 2.2.5 and 
Corollary 2.2.6. Otherwise, A does not have a greatest element, so that there 
is a bijection f : N → A, and A is countably infinite. 


Corollary 2.3.3 A subset B of a countable set A is countable. 


Proof If B is finite, then B is countable. If B is infinite, then A is infinite, 
and there exists a bijection g : N → A. Then g⁻¹(B) is infinite, and so 
does not have a greatest element. By the theorem, there exists a bijection 
f : N → g⁻¹(B). Then g ∘ f : N → B is a bijection, so that B is countably 
infinite. 

It is useful to have simple sufficient conditions for a set to be countable. 
The next proposition provides these. 


Proposition 2.3.4 Suppose that A is a set. The following are equivalent. 


(i) A is countable. 
(ii) Either A = ∅ or there exists a surjective mapping f : N → A. 
(iii) There exists an injective mapping j : A → N. 


Proof Suppose that A is a countable non-empty set. If A is finite, there 
exists a bijection f : I_{|A|} → A. Extend f to a surjection f : N → A by 
setting f(n) = f(1) for n > |A|. If A is countably infinite, there is a bijection 
of N onto A. Thus (i) implies (ii). 

Suppose that (ii) holds. If A is empty, then the empty mapping is an 
injective mapping of A into N. Otherwise, if a ∈ A then {n ∈ N : f(n) = a} 
is non-empty; let g(a) be its least element. Then g : A → N is an injective 
mapping, and so (ii) implies (iii). 

Finally, suppose that (iii) holds. If A = ∅, then A is finite, and so is 
countable. If A ≠ ∅ and j(A) is bounded above, then j(A) is finite, and so A 
is finite. If A ≠ ∅ and j(A) is not bounded above, let f : N → j(A) be the 
standard enumeration of j(A). Then j⁻¹ ∘ f is a bijection of N onto A, so 
that A is countable: (iii) implies (i). 


In case (ii), each element of A is labelled, all the labels are used, but an 
element of A may have many labels. In case (iii), each element of A is given 
a separate label from N, but all the labels need not be used. 

When condition (ii) is used, it is important to remember that the empty 
set needs to be considered separately. 
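
The passage from a surjection f : N → A to an injection g : A → N, by taking least labels, can be sketched as follows. The three-element set and the particular surjection `f` are hypothetical examples of ours, chosen only to make the computation concrete.

```python
def f(n: int) -> str:
    """A surjection of N onto A = {'a', 'b', 'c'} (a hypothetical example)."""
    return "abc"[n % 3]


def least_label(a: str) -> int:
    """g(a): the least n in N with f(n) = a; assumes that a lies in f(N)."""
    n = 1
    while f(n) != a:
        n += 1
    return n


labels = {a: least_label(a) for a in "abc"}
assert labels == {"a": 3, "b": 1, "c": 2}
# Distinct elements receive distinct labels, so g is injective.
assert len(set(labels.values())) == len(labels)
```

Each element has many labels under f (for example f(1) = f(4) = 'b'), but g retains only the least of them.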


Corollary 2.3.5 If g : A → B, and A is countable, then g(A) is countable. 


Proof If A is empty, then g(A) is empty, and so is countable. Otherwise, 
there exists a surjective mapping f of N onto A. Then g ∘ f is a surjective 
mapping of N onto g(A), so that g(A) is countable. 


Theorem 2.3.6 The set N x N is countable. 


Proof The mapping f : N × N → N defined by f(k, l) = 2^{k−1}(2l − 1), for 
(k, l) ∈ N × N, is a bijection. (See Exercise 2.1.3.) 
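
The bijection used in this proof, together with its inverse, can be written out explicitly. The following is a sketch; the names `pair` and `unpair` are ours, and `ell` stands for the second coordinate l.

```python
def pair(k: int, ell: int) -> int:
    """f(k, l) = 2^(k-1) (2l - 1), a mapping from N x N to N."""
    return 2 ** (k - 1) * (2 * ell - 1)


def unpair(n: int):
    """Inverse of pair: split n into a power of 2 and an odd factor."""
    k = 1
    while n % 2 == 0:  # each factor of 2 raises k by one
        n //= 2
        k += 1
    return k, (n + 1) // 2  # the odd factor is 2l - 1


# The two mappings are mutually inverse on the ranges tested.
assert all(pair(*unpair(n)) == n for n in range(1, 1000))
assert all(unpair(pair(k, ell)) == (k, ell)
           for k in range(1, 12) for ell in range(1, 12))
```

The existence and uniqueness of the decomposition used in `unpair` is exactly the content of Exercise 2.1.3.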


Corollary 2.3.7 If A and B are countable sets then A × B is countable. 


Proof There exist injective mappings j_A : A → N and j_B : B → N. If 
(a, b) ∈ A × B, set j((a, b)) = (j_A(a), j_B(b)). Then j : A × B → N × N 
is injective, so that the mapping f ∘ j is injective, where f is the bijection 
of Theorem 2.3.6. The result follows from Proposition 2.3.4. 


Corollary 2.3.8 If A is a countable set, and each a ∈ A is countable, then 
∪_{a∈A} a is countable. (The countable union of countable sets is countable.) 


Proof First, let B = {a ∈ A : a ≠ ∅}. Then ∪_{a∈A} a = ∪_{a∈B} a, and so we 
can suppose that each a ∈ A is non-empty. Secondly, if A is empty then 
∪_{a∈A} a is empty, and so is countable. Thus we can suppose that the set A, 
and each of the sets a ∈ A, is non-empty. Using Proposition 2.3.4 (ii), there 
exists a surjection c : N → A, and for each m ∈ N there exists a surjection 
f_m : N → c(m). (Note that here we use the countable axiom of choice; in 
many specific cases, this can be avoided.) Now if (m, n) ∈ N × N, we set 
g(m, n) = f_m(n): we use m to select an element c(m) of A, and use n to select 
an element of c(m). Then g is a surjection of N × N onto ∪_{a∈A} a, and so 
∪_{a∈A} a is countable, by Corollary 2.3.5. 
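
The mapping g of this proof can be made concrete in a case where no choice is needed. In the sketch below, which is only an assumed example of ours, c(m) is the set of multiples of m and f_m(n) = mn, so that the union is N itself; composing g with the inverse of the bijection (k, l) → 2^{k−1}(2l − 1) gives a surjection of N onto the union.

```python
def unpair(i: int):
    """Inverse of (k, l) -> 2^(k-1) (2l - 1): a bijection of N onto N x N."""
    k = 1
    while i % 2 == 0:
        i //= 2
        k += 1
    return k, (i + 1) // 2


def g(m: int, n: int) -> int:
    """g(m, n) = f_m(n): the n-th element of c(m), the multiples of m."""
    return m * n


def h(i: int) -> int:
    """The composite surjection of N onto the union of the sets c(m)."""
    m, n = unpair(i)
    return g(m, n)


values = [h(i) for i in range(1, 200)]
# Every element of {1, ..., 50} has appeared; repetitions are allowed.
assert set(range(1, 51)) <= set(values)
```

As with any surjection obtained in this way, an element of the union may be listed many times.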


If we assume the axiom of dependent choice, we can establish some 
properties of infinite sets. 


Proposition 2.3.9 Assuming the axiom of dependent choice, if A is an 
infinite set, then A contains a countably infinite subset. 


Proof Let S(A) be the set of finite sequences in A. If s = (a_0, ..., a_n) ∈ 
S(A), let φ(s) = {(a_0, ..., a_n, x) : x ∉ {a_0, ..., a_n}}. Then φ(s) ≠ ∅, 
since A is infinite. 

Let a be an element of A. Let s_0 = (a). By the axiom of dependent choice, 
there exists a sequence (s_n)_{n=0}^{∞} in S(A) such that s_{n+1} ∈ φ(s_n), for n ∈ Z⁺. 
Set b_n = a_{n,n}, where s_n = (a_{n,0}, ..., a_{n,n}). By the construction, (b_n)_{n=0}^{∞} is a 
sequence of distinct elements of A. 


Corollary 2.3.10 Assuming the axiom of dependent choice, if A is an 
infinite set then P(A) is uncountable. 


Proof If C is a countably infinite subset of A, then P(C) ⊆ P(A), and P(C) 
is uncountable. 


Corollary 2.3.11 Assuming the axiom of dependent choice, an infinite 
set A is Dedekind infinite. 


Proof Let B be a countably infinite subset of A, and let (b_1, b_2, ...) be a 
listing of the elements of B, without repetition. Let f(b_j) = b_{2j} for j ∈ N, 
and let f(a) = a for a ∈ A \ B. Then f is an injective map of A into itself, 
and A \ f(A) = {b_1, b_3, b_5, ...} is a countably infinite set. 


Exercises 


2.3.1 Show that a finite product of countable sets is countable. What about 
a countable product of finite sets? 

2.3.2 The countable pigeonhole principle. Suppose that f is a mapping from 
an uncountable set A to a countable set B. Show that there exists b ∈ B 
such that f⁻¹({b}) is uncountable. 

2.3.3 Suppose that A is a countably infinite set. Determine which of the 
following sets are countable and which are not. 

(a) The set of finite subsets of A. 

(b) The set of permutations of A. 

(c) The set of permutations σ of A for which σ² is the identity. 

(d) The set of permutations τ of A for which {a ∈ A : τ(a) ≠ a} is 
finite. 

2.3.4 Let J be the set of mappings j : N → N for which j(m) ≤ j(n) for 
m ≤ n. Show that J is uncountable. 

2.3.5 Let D be the set of mappings d : N → N for which d(m) ≥ d(n) for 
m ≤ n. Show that D is countable. 

2.3.6 Suppose that B is a disjoint set of subsets of N: if A, A' ∈ B and A ≠ A' 
then A ∩ A' = ∅. Show that B is countable. 

2.3.7 If A ∈ P(Z⁺) and n ∈ Z⁺, let f_A(n) = 2^n if n ∈ A and let f_A(n) = 0 
otherwise. Let g_A(n) = Σ_{j=0}^{n} f_A(j), and let G(A) = {g_A(n) : n ∈ Z⁺}. 
Show that {G(A) : A ∈ P(Z⁺)} is an uncountable subset of P(Z⁺) 
with the property that G(A) ∩ G(A') is finite, if A ≠ A'. [Hint: consider 
the binary expansion of g_A(n).] 


2.4 Sequences and subsequences 


A strictly increasing function from N to N defines a sequence in N. Such a 
sequence (n_k)_{k=1}^{∞} is called a subsequence of N, and the set {n_k : k ∈ N} is 
called the image of the subsequence. Theorem 2.3.1 shows that there is a 
one-one correspondence between the infinite subsets of N and the subsequences 
of N. 


Proposition 2.4.1 Suppose that (m_k)_{k=1}^{∞} and (n_k)_{k=1}^{∞} are subsequences 
of N, with images A and B respectively. If A ⊆ B then m_k ≥ n_k for all 
k ∈ N. 


Proof We prove this by induction. First, n_1 = inf(B) ≤ inf(A) = m_1. 
Suppose that m_k ≥ n_k. If m_k = n_k then 


n_{k+1} = inf{b ∈ B : b > n_k} ≤ inf{a ∈ A : a > m_k} = m_{k+1}. 

If m_k > n_k then, since m_k ∈ A ⊆ B, n_{k+1} = inf{b ∈ B : b > n_k} ≤ m_k < m_{k+1}. 

Frequently we construct a sequence of subsequences, and use them to
construct a further subsequence. This involves a diagonal procedure.

Theorem 2.4.2 (The diagonal procedure) Suppose that $((n^{(j)}_k)_{k=1}^\infty)_{j=1}^\infty$ is
a sequence of subsequences of $\mathbf{N}$, that $A_j$ is the image of $(n^{(j)}_k)_{k=1}^\infty$, for $j \in \mathbf{N}$,
and that $(A_j)_{j=1}^\infty$ is a decreasing sequence. Let $m_k = n^{(k)}_k$, for $k \in \mathbf{N}$. Then
$(m_k)_{k=1}^\infty$ is a subsequence of $\mathbf{N}$, and $m_k \in A_l$ for $k \geq l$.


Proof If $k \geq l$ then $m_k \in A_k \subseteq A_l$, so that $m_k \in A_l$. We must show that
$(m_k)_{k=1}^\infty$ is strictly increasing. This follows from Proposition 2.4.1, since

$$m_k = n^{(k)}_k \leq n^{(k+1)}_k < n^{(k+1)}_{k+1} = m_{k+1}.$$


Suppose that $(a_n)_{n=1}^\infty$ is a sequence in a set $A$ and that $(n_k)_{k=1}^\infty$ is a
subsequence of $\mathbf{N}$. The composite $(a_{n_k})_{k=1}^\infty$ is called a subsequence of $(a_n)_{n=1}^\infty$. In
fact, it would be more accurate to define the subsequence as the ordered pair
$((a_{n_k})_{k=1}^\infty, (n_k)_{k=1}^\infty)$, since the set $\{n_k : k \in \mathbf{N}\}$ is important. We call it the
support of the subsequence, and denote it by $\mathrm{supp}\,(a_{n_k})_{k=1}^\infty$.

Let us give an important example. 


Theorem 2.4.3 Suppose that $(a_n)_{n=1}^\infty$ is a sequence in a totally ordered
set $A$. Then there exists a subsequence $(a_{n_k})_{k=1}^\infty$ such that either

(i) if $k < l$ then $a_{n_k} < a_{n_l}$ ($(a_{n_k})_{k=1}^\infty$ is strictly increasing), or
(ii) if $k < l$ then $a_{n_k} > a_{n_l}$ ($(a_{n_k})_{k=1}^\infty$ is strictly decreasing), or
(iii) if $k < l$ then $a_{n_k} = a_{n_l}$ ($(a_{n_k})_{k=1}^\infty$ is constant).


Proof Let us say that an index $n$ is a high point if $a_n > a_m$ for all $m > n$.
There are two possibilities. First, there are infinitely many high points $n_1 <
n_2 < \cdots$. In this case, $(a_{n_k})_{k=1}^\infty$ is strictly decreasing. Secondly, there are only
finitely many high points. In this case, there exists $N$ such that if $n \geq N$ then
$n$ is not a high point, so that there exists a least $m > n$ with $a_m \geq a_n$. We
can therefore recursively find a sequence $(n_1 < n_2 < \cdots)$ with $n_1 = N$ and
$a_{n_{j+1}} \geq a_{n_j}$ for all $j$. Then either there exists $k$ such that $a_{n_j} = a_{n_k}$ for all
$j \geq k$, in which case we have a constant subsequence, or we can extract a
further subsequence which is strictly increasing.
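
The notion of a high point can be tried out on a finite list. The following Python sketch is only a finite analogue of the argument (the theorem itself concerns infinite sequences), and the function name high_points is an illustrative choice. It lists the indices n with a[n] > a[m] for every later index m; the corresponding values are then strictly decreasing, as in the first case of the proof.

```python
def high_points(a):
    """Indices n such that a[n] > a[m] for every later index m (finite analogue)."""
    result = []
    best = None  # maximum of the entries strictly to the right of index n
    for n in range(len(a) - 1, -1, -1):
        if best is None or a[n] > best:
            result.append(n)
        best = a[n] if best is None else max(best, a[n])
    result.reverse()
    return result


if __name__ == "__main__":
    a = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
    points = high_points(a)
    # The values at the high points form a strictly decreasing subsequence.
    print(points, [a[n] for n in points])
```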


Theorem 2.4.3 is a consequence of a much more general theorem. This
has considerable theoretical importance, but we shall not use it later. It may
therefore be omitted on a first reading. First we introduce some notation and
terminology. Suppose that $C$ is a finite set and that $f : A \to C$ is a surjective
mapping. Then we call $f$ a colouring of $A$. The elements of $C$ are the colours;
$a$ has colour $f(a)$. The collection of sets $\{f^{-1}(\{c\}) : c \in C\}$ partitions $A$ into
sets of different colour.

If $A$ is a set and $k \in \mathbf{N}$, we denote by $P_k(A)$ the set of all subsets of $A$ of
size $k$. We identify $P_1(A)$ with $A$. If $B \subseteq A$ then $P_k(B) \subseteq P_k(A)$.


Theorem 2.4.4 (Ramsey’s theorem) Suppose that $f : P_k(\mathbf{N}) \to C$
is a colouring of $P_k(\mathbf{N})$. Then there exists an infinite subset $M =
\{n_1 < n_2 < \cdots\}$ of $\mathbf{N}$ such that $f(P_k(M))$ is a singleton: all the subsets
of $M$ of size $k$ have the same colour.


Proof The proof is by induction on $k$. The result is true if $k = 1$, since the
finite collection of sets $\{f^{-1}(\{c\}) : c \in C\}$ is a partition of the infinite set $\mathbf{N}$,
and so, by Exercise 2.2.5, one of the sets $f^{-1}(\{c\})$ must be infinite. We take
this for $M$.

Suppose that the result is true for $k$, and that $f : P_{k+1}(\mathbf{N}) \to C$ is a
colouring of $P_{k+1}(\mathbf{N})$. The sets in $P_{k+1}(\mathbf{N})$ have $k + 1$ elements, and, in
order to use the inductive hypothesis, we need to relate them to sets with
$k$ elements. First, let $b_1 = 1$ and let $D_1 = \{n \in \mathbf{N} : n > b_1\}$. If $B \in
P_k(D_1)$, let $g_1(B) = f(\{b_1\} \cup B)$; then $g_1$ is a colouring of $P_k(D_1)$. By the
inductive hypothesis, there exist $c_1 \in C$ and an infinite subset $E_1$ of $D_1$
such that $g_1(B) = c_1$ for all $B \in P_k(E_1)$. Thus $f(A) = c_1$ for those $A$
in $P_{k+1}(\{b_1\} \cup E_1)$ for which $b_1 \in A$. But of course there are many other
subsets in $P_{k+1}(\{b_1\} \cup E_1)$. We therefore iterate the procedure.

We use recursion to show that there exists a sequence $(b_n, E_n, c_n)_{n=1}^\infty$,
where $(b_n)_{n=1}^\infty$ is a strictly increasing sequence in $\mathbf{N}$, $(E_n)_{n=1}^\infty$ is a strictly
decreasing sequence of infinite subsets of $\mathbf{N}$ and $(c_n)_{n=1}^\infty$ is a sequence of
colours, with the following properties:

(i) $b_n < e$ for all $e \in E_n$;
(ii) $b_{n+1}$ is the least element of $E_n$;
(iii) $f(\{b_n\} \cup A) = c_n$ for all $A \in P_k(E_n)$.


We have found $(b_1, E_1, c_1)$. Suppose that we have found $(b_j, E_j, c_j)$ which
satisfy the conditions, for $1 \leq j \leq n$. Let $b_{n+1}$ be the least element of $E_n$. Let
$D_{n+1} = E_n \setminus \{b_{n+1}\}$. If $A \in P_k(D_{n+1})$, we set $g_{n+1}(A) = f(\{b_{n+1}\} \cup A)$. Then
$g_{n+1}$ is a colouring of $P_k(D_{n+1})$. By the inductive hypothesis, there exist
an infinite subset $E_{n+1}$ of $D_{n+1}$ and $c_{n+1} \in C$ such that if $A \in P_k(E_{n+1})$ then
$f(\{b_{n+1}\} \cup A) = g_{n+1}(A) = c_{n+1}$. This establishes the recursion.

Now consider the sequence $(c_n)_{n=1}^\infty$. If $c \in C$, let $A_c = \{n \in \mathbf{N} : c_n = c\}$.
The finite collection $\{A_c : c \in C\}$ of subsets of $\mathbf{N}$ forms a partition of the
infinite set $\mathbf{N}$, and so one of them, $A_{c_0}$ say, must be infinite. We take
$M = \{b_n : n \in A_{c_0}\}$. Finally, we show that if $A \in P_{k+1}(M)$ then $f(A) = c_0$, so that
$M$ satisfies the conclusions of the theorem. If $A \in P_{k+1}(M)$, we can write
$A = \{b_n\} \cup B$, where $b_n$ is the least element of $A$ and $B = A \setminus \{b_n\}$. Then
$n \in A_{c_0}$ and $B \in P_k(E_n)$, so that $f(A) = c_n = c_0$, by (iii).


Let us now see how Ramsey’s theorem can be used to prove Theorem 2.4.3.
Consider $P_2(\mathbf{N})$. If $m < n$, colour the unordered pair $\{m, n\}$ red if $a_m < a_n$,
yellow if $a_m > a_n$ and blue if $a_m = a_n$. Then there exists an infinite subset
$M = \{n_1 < n_2 < \cdots\}$ such that the sets $\{n_j, n_k\}$ with $j \neq k$ all have the
same colour. Thus the sets $\{n_j, n_{j+1}\}$ all have the same colour. If the colour is
red, we have a strictly increasing subsequence; if yellow, a strictly decreasing
subsequence; and if blue, a constant subsequence.


Exercises 


2.4.1 Suppose that $(A_n)_{n=1}^\infty$ is a sequence of subsets of a set $A$. Show that
there exists a subsequence $(A_{n_k})_{k=1}^\infty$ which is either constant, or strictly
increasing, or strictly decreasing, or such that if $k \neq l$ then $A_{n_k} \not\subseteq A_{n_l}$
and $A_{n_l} \not\subseteq A_{n_k}$.

2.4.2 Suppose that $(g_n)_{n=1}^\infty$ is a sequence in a group $G$. Show that either there
is a sequence $(g_{n_k})_{k=1}^\infty$ such that $g_{n_k} g_{n_l} = g_{n_l} g_{n_k}$ for $k, l \in \mathbf{N}$ or there
is a sequence $(g_{n_k})_{k=1}^\infty$ such that $g_{n_k} g_{n_l} \neq g_{n_l} g_{n_k}$ if $k \neq l$.


2.5 The integers 


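Our next task will be to adjoin a set $-\mathbf{N}$ of negative numbers to $\mathbf{Z}^+$ to obtain
the set $\mathbf{Z}$ of integers. There are many ways of doing this. We use a rather
naive one, which involves a certain amount of case-by-case checking. Another
method appears in Exercise 2.7.5.

Define a mapping $n \to n^+$ from $\mathbf{Z}^+$ to $\mathbf{Z}^+ \times \mathbf{Z}^+$ by setting $n^+ = (0, n)$,
and define a mapping $n \to n^-$ from $\mathbf{N}$ to $\mathbf{Z}^+ \times \mathbf{Z}^+$ by setting $n^- = (n, 0)$,
and set

$$\mathbf{Z} = \{n^+ : n \in \mathbf{Z}^+\} \cup \{n^- : n \in \mathbf{N}\}.$$

We define addition in $\mathbf{Z}$ by setting

$$n^+ + m^+ = (n+m)^+, \qquad n^- + m^- = (n+m)^-, \quad\text{and}$$

$$n^+ + m^- = m^- + n^+ = \begin{cases} (n-m)^+ & \text{if } n \geq m, \\ (m-n)^- & \text{if } n < m. \end{cases}$$

Note that addition is commutative: if $p, q \in \mathbf{Z}$ then $p + q = q + p$.
We now verify that addition is associative; we do this case by case.
Certainly

$$(m^+ + n^+) + p^+ = (m+n+p)^+ = m^+ + (n^+ + p^+)$$
$$\text{and}\quad (m^- + n^-) + p^- = (m+n+p)^- = m^- + (n^- + p^-).$$

Next,

$$(m^+ + n^+) + p^- = (m+n)^+ + p^- = \begin{cases} (m+n-p)^+ & \text{if } m+n \geq p, \\ (p-m-n)^- & \text{if } m+n < p, \end{cases}$$

while

$$m^+ + (n^+ + p^-) = \begin{cases} m^+ + (n-p)^+ = (m+n-p)^+ & \text{if } n \geq p, \\ m^+ + (p-n)^- = (m+n-p)^+ & \text{if } m+n \geq p > n, \\ m^+ + (p-n)^- = (p-m-n)^- & \text{if } m+n < p. \end{cases}$$

Thus $(m^+ + n^+) + p^- = m^+ + (n^+ + p^-)$. Using this, and the commutative
property, we find that

$$(m^+ + p^-) + n^+ = n^+ + (m^+ + p^-) = (n^+ + m^+) + p^-$$
$$= (m^+ + n^+) + p^- = m^+ + (n^+ + p^-)$$
$$= m^+ + (p^- + n^+),$$

and the other cases are dealt with in a similar way.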
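
The case-by-case checking can also be carried out mechanically on a small range. The following Python sketch assumes the encoding used above, with $n^+$ stored as the pair (0, n) and $n^-$ as the pair (n, 0); the function names pos, neg and add are illustrative only, and a finite check of this kind is an experiment, not a proof.

```python
def pos(n):
    """n^+ for n in Z^+ = {0, 1, 2, ...}, encoded as the pair (0, n)."""
    return (0, n)


def neg(n):
    """n^- for n in N = {1, 2, ...}, encoded as the pair (n, 0)."""
    return (n, 0)


def add(x, y):
    """Case-by-case addition on pairs of the form (0, n) or (n, 0)."""
    (a, b), (c, d) = x, y
    if a == 0 and c == 0:
        # Both summands have the form n^+: n^+ + m^+ = (n + m)^+.
        return (0, b + d)
    if b == 0 and d == 0:
        # Each summand is some n^- or is 0^+ = (0, 0): add first coordinates.
        return (a + c, 0)
    # Mixed case: one summand is n^+ with n >= 1, the other is m^-.
    n, m = (b, c) if a == 0 else (d, a)
    return (0, n - m) if n >= m else (m - n, 0)


if __name__ == "__main__":
    sample = [neg(n) for n in range(4, 0, -1)] + [pos(n) for n in range(5)]
    associative = all(
        add(add(x, y), z) == add(x, add(y, z))
        for x in sample for y in sample for z in sample
    )
    print("associative on the sample:", associative)
```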

Note also that $0^+$ acts as an identity: if $p \in \mathbf{Z}$ then $p + 0^+ = 0^+ + p = p$,
and if $n \in \mathbf{N}$ then $n^+ + n^- = 0^+$.

Thus we have the following. 


Theorem 2.5.1 $(\mathbf{Z}, +)$ is an abelian group with identity element $0^+$,
generated by $1^+$. The mapping $\theta : \mathbf{Z}^+ \to \mathbf{Z}$ defined by $\theta(n) = n^+$ is an injective
mapping of $\mathbf{Z}^+$ into $\mathbf{Z}$, and $\theta(n + m) = \theta(n) + \theta(m)$.


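In particular, $-(n^+) = n^-$ and $-(n^-) = n^+$.

The set $\mathbf{Z}$ is the set of integers. We identify $\mathbf{Z}^+$ with $\theta(\mathbf{Z}^+)$, and $\mathbf{N}$ with
$\theta(\mathbf{N})$. Thus $\mathbf{Z} = \mathbf{Z}^+ \cup (-\mathbf{Z}^+) = \mathbf{N} \cup \{0\} \cup (-\mathbf{N})$, and the latter is a disjoint
union. If $n \in \mathbf{N}$, we say that $n$ is positive; if $n \in \mathbf{Z}^+$, we say that $n$ is
non-negative; if $n \in -\mathbf{N}$ we say that $n$ is negative, and if $n \in -\mathbf{N} \cup \{0\}$ we say
that $n$ is non-positive.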

The fact that $(\mathbf{Z}, +)$ is a group is important; it leads to useful algebraic
results.

Proposition 2.5.2 Suppose that $(G, \circ)$ is a group and that $g \in G$. Then
there exists a unique homomorphism $\phi$ of $(\mathbf{Z}, +)$ into $G$ for which $\phi(1) = g$.


Proof We define $\phi$ recursively on $\mathbf{Z}^+$. Define a mapping $r : G \to G$ by
setting $r(h) = h \circ g$, for $h \in G$. By recursion, there exists a unique mapping
$\phi : \mathbf{Z}^+ \to G$ such that $\phi(0) = e_G$, the identity in $G$, and $\phi(n + 1) = r(\phi(n))$.
Set $g^n = \phi(n)$. Then

$$g^{n+1} = \phi(n + 1) = \phi(n) \circ g = g^n \circ g;$$

an easy induction shows that $g^{m+n} = g^m \circ g^n$, for $m, n \in \mathbf{Z}^+$. Now define
$g^{-n} = (g^n)^{-1}$, for $-n \in -\mathbf{N}$. It is again straightforward to check that $g^{a+b} =
g^a \circ g^b$ for $a, b \in \mathbf{Z}$. In particular, $g^n \circ g^{-n} = g^{-n} \circ g^n = e_G$, so that $g^{-n}$
is the inverse of $g^n$. Finally, uniqueness follows from the uniqueness of the
recursion.


The image $\phi(\mathbf{Z})$ is a subgroup of $G$. It is the smallest subgroup of $G$ which
contains $g$, and is denoted by $\mathrm{Gp}(g)$. If $\mathrm{Gp}(g) = G$, we say that $G$ is a cyclic
group, with generator $g$.


Proposition 2.5.3 The additive group (Z,+) is a cyclic group, with 
generator 1. 


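Proof Let $\mathrm{Gp}(1)$ be the subgroup of $\mathbf{Z}$ generated by $1$. Then $0 \in \mathrm{Gp}(1)$. By
induction, $n \in \mathrm{Gp}(1)$ for all $n \in \mathbf{N}$. But then $-n \in \mathrm{Gp}(1)$ for all $n \in \mathbf{N}$, and
so $\mathrm{Gp}(1) = \mathbf{Z}$.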


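Next, we define an order on $\mathbf{Z}$. We set $k \leq j$ if $j - k \in \mathbf{Z}^+$. If $j - k \in \mathbf{Z}^+$
and $k - l \in \mathbf{Z}^+$ then $j - l = (j - k) + (k - l) \in \mathbf{Z}^+$; thus if $k \leq j$ and
$l \leq k$ then $l \leq j$. If $k \not\leq j$ then $j - k \notin \mathbf{Z}^+$, so that $j - k \in -\mathbf{N}$, and
$k - j = -(j - k) \in \mathbf{N} \subseteq \mathbf{Z}^+$. Thus $j \leq k$. Consequently $\leq$ is a total order
on $\mathbf{Z}$. Note that $j \leq k$ if and only if $j + l \leq k + l$, for any $j, k, l \in \mathbf{Z}$. We
can arrange the integers in increasing order as a doubly infinite sequence of
terms:

$$\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots$$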


The order and the group structure of $(\mathbf{Z}, +, \leq)$ are related. An ordered
group is a group $G$, together with a total order on $G$ with the property that
if $g \leq g'$ and $h \in G$ then $h \circ g \leq h \circ g'$ and $g \circ h \leq g' \circ h$. We denote the
set $\{g \in G : e \leq g\}$ by $G^+$. The preceding remarks show that $(\mathbf{Z}, +, \leq)$ is an
ordered group. Further, the set $\mathbf{Z}^+$ is well-ordered, and $\mathbf{Z}$ has at least two
elements. We now show that these properties characterize $\mathbf{Z}$.




Theorem 2.5.4 Suppose that $(G, \circ, \leq)$ is an ordered group with at least
two elements and that $G^+$ is well-ordered. Then there exists a unique
order-preserving group isomorphism $\theta$ of $(\mathbf{Z}, +, \leq)$ onto $(G, \circ, \leq)$.


Proof We do not assume that $G$ is an abelian group, and so we write the
group operation as multiplication. If $g \in G$, then either $g$ or $g^{-1}$ is in $G^+$
(if $g \notin G^+$ then $g \leq e$; composing with $g^{-1}$, $e = g \circ g^{-1} \leq g^{-1}$, so that
$g^{-1} \in G^+$). Since $G$ has at least two elements, the set $P = \{g \in G : e < g\}$
of strictly positive elements is not empty. Let $1_G$ be the least element of
$P$. By Proposition 2.5.2, there exists a unique homomorphism $\theta : \mathbf{Z} \to G$
with $\theta(1) = 1_G$. An easy induction shows that $\theta(\mathbf{N}) \subseteq P$. Suppose that
$j, k \in \mathbf{Z}$ and that $j < k$. Then $k - j \in \mathbf{N}$, so that $\theta(k - j) \in P$. Thus
$e < \theta(k - j) = \theta(k) \circ (\theta(j))^{-1}$. Multiplying by $\theta(j)$, we see that $\theta(j) < \theta(k)$;
$\theta$ is order-preserving. Since the order on $\mathbf{Z}$ is a total order, it follows that $\theta$
is injective.

Next we show that $\theta(\mathbf{Z}) = G$. If not, there exists $g \in G \setminus \theta(\mathbf{Z})$. Since $\theta(\mathbf{Z})$
is a subgroup of $G$, $g \neq e$, and $g^{-1} \in G \setminus \theta(\mathbf{Z})$. As before, one of $g$ and $g^{-1}$ is
strictly positive, and so $P \setminus \theta(\mathbf{Z})$ is non-empty. Let $g_0$ be its least element.
Since $1_G^{-1} = \theta(-1) \in \theta(\mathbf{Z})$ and $g_0 \notin \theta(\mathbf{Z})$, it follows that $1_G^{-1} \circ g_0 \notin \theta(\mathbf{Z})$.
Since $e = \theta(0) \in \theta(\mathbf{Z})$, it follows that $1_G^{-1} \circ g_0 \neq e$. Since $1_G^{-1} \circ g_0 < g_0$ and since
$g_0$ is the least element of $P \setminus \theta(\mathbf{Z})$, it follows that $1_G^{-1} \circ g_0 < e$. Multiplying
by $1_G$, it follows that $g_0 < 1_G$. But $g_0 \in P$ and $1_G$ is the least element of $P$,
and so we have a contradiction.

Uniqueness then follows from Proposition 2.5.2.


What about multiplication? We want to extend the multiplication defined
on $\mathbf{Z}^+$, and to preserve the distributive law. Thus if $m, n \in \mathbf{Z}^+$ we require
that

$$m.n + m.(-n) = m.(n + (-n)) = m.0 = 0 \quad\text{and}$$
$$n.m + (-n).m = (n + (-n)).m = 0.m = 0,$$

so that $m.(-n) = -(m.n) = -(n.m) = (-n).m$. In particular, we require
that $0.(-n) = (-n).0 = 0$. Similarly we require that

$$(-m).n + (-m).(-n) = (-m).(n + (-n)) = (-m).0 = 0 \quad\text{and}$$
$$n.(-m) + (-n).(-m) = (n + (-n)).(-m) = 0.(-m) = 0,$$

so that $(-m).(-n) = -((-m).n) = m.n$ and $(-n).(-m) = -(n.(-m)) = n.m$.

Summing up, we have the following multiplication table:

                     $0$     $m \in \mathbf{N}$     $-m \in -\mathbf{N}$

    $0$              $0$     $0$          $0$
    $n \in \mathbf{N}$        $0$     $nm$         $-nm$
    $-n \in -\mathbf{N}$      $0$     $-nm$        $nm$


With this multiplication, we have the following extension of Theorem 2.1.2 
and Theorem 2.1.3. 


Theorem 2.5.5 Suppose that $j, k, l \in \mathbf{Z}$.

(i) $j.k = k.j$ (commutativity);

(ii) $0.j = 0$, $1.j = j$ and $(-1).j = -j$;

(iii) $(j.k).l = j.(k.l)$ (associativity);

(iv) if $j.k = l.k$ and $k \neq 0$ then $j = l$ (cancellation);

(v) if $j.k = 0$ then $j = 0$ or $k = 0$;

(vi) $j.(k + l) = (j.k) + (j.l)$ (the distributive law).


Proof The proof is again left as an exercise for the reader.


Again, we can write $jk$ for $j.k$. Then $(jk)l = j(kl) = jkl$. We write
$(jk) + (jl) = jk + jl$; multiplication is carried out before addition.


Exercises 


2.5.1 Suppose that $x \in \mathbf{Z}$ and that $x \neq 0$. Show that $x^2 > 0$.
2.5.2 Show that Z is countable. Define an explicit bijection from N onto Z. 


2.6 Divisibility and factorization 


We now consider divisibility in $\mathbf{N}$ and in $\mathbf{Z}$. If $j$ and $k$ are in $\mathbf{Z}$, we say that
$j$ divides $k$, and write $j \mid k$, if there exists $q \in \mathbf{Z}$ such that $k = qj$. It follows
from Corollary 2.1.4 that the only elements of $\mathbf{Z}$ which divide every element
of $\mathbf{Z}$ are $1$ and $-1$: we call them the units of $\mathbf{Z}$.

In order to study divisibility, we first consider the additive group $(\mathbf{Z}, +)$,
and ask the question: what are the subgroups of $(\mathbf{Z}, +)$? Suppose that $n \in \mathbf{N}$.
By Proposition 2.5.2, there is a homomorphism $\theta : \mathbf{Z} \to \mathbf{Z}$ such that $\theta(1) = n$.
Then

$$\theta(\mathbf{Z}) = \mathbf{Z}n = \{k \in \mathbf{Z} : k = jn \text{ for some } j \in \mathbf{Z}\} = \{k \in \mathbf{Z} : n \mid k\},$$

so that $\mathbf{Z}n$ is a subgroup of $(\mathbf{Z}, +)$ (note that $\mathbf{Z}0 = \{0\}$ and $\mathbf{Z}1 = \mathbf{Z}$). If $n \neq 0$
then $n$ is the least positive element of $\mathbf{Z}n$, and so $\mathbf{Z}m \neq \mathbf{Z}n$ if $m \neq n$.

These subgroups are useful when considering division with remainder.


Proposition 2.6.1 Suppose that $m, n \in \mathbf{N}$. There exist $q, r \in \mathbf{Z}^+$, with
$0 \leq r < n$, such that $m = qn + r$.


Proof Let $L = \{j \in \mathbf{Z}n : 0 \leq j \leq m\}$. Since $0 \in L$, $L$ is a non-empty finite
set, and therefore it has a greatest element $l = qn$. Let $r = m - qn$, so that
$r \geq 0$. Since $qn + n = (q + 1)n \notin L$, $m < qn + n$, and so $r = m - qn < n$.
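
The proof is constructive, and can be followed step by step. The following Python sketch assumes positive integer inputs and uses the illustrative name divide_with_remainder; it forms the finite set L of multiples of n between 0 and m, takes its greatest element qn, and sets r = m - qn.

```python
def divide_with_remainder(m, n):
    """Return (q, r) with m == q * n + r and 0 <= r < n, for integers m, n >= 1."""
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive integers")
    # L is the set of multiples of n lying between 0 and m inclusive.
    multiples = range(0, m + 1, n)
    greatest = multiples[-1]      # the greatest element l = q * n of L
    q = len(multiples) - 1        # L = {0, n, 2n, ..., q * n}
    r = m - greatest
    return q, r


if __name__ == "__main__":
    print(divide_with_remainder(1677, 1131))
```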


In fact, the subgroups Zn are the only subgroups of Z. 


Proposition 2.6.2 If $H$ is a subgroup of $(\mathbf{Z}, +)$ then $H = \mathbf{Z}n$ for some
$n \in \mathbf{Z}^+$.


Proof If $H \neq \{0\} = \mathbf{Z}0$, then, since $h \in H$ if and only if $-h \in H$, the set
$H \cap \mathbf{N}$ of positive elements of $H$ is non-empty. Let $n$ be its least member.
Then $\mathbf{Z}n \subseteq H$. We shall show that $H = \mathbf{Z}n$. Suppose that $m \in H$ and that
$m$ is positive. By the previous proposition, we can write $m = qn + r$, where
$0 \leq r < n$. But $qn \in H$, and so $r = m - qn \in H$. Since $n$ is the least
positive element of $H$ and $r < n$, it follows that $r = 0$. Thus $m = qn \in \mathbf{Z}n$.
If $m \in H$ and $m$ is negative, then $-m \in H$, so that $-m \in \mathbf{Z}n$; consequently
$m \in \mathbf{Z}n$.


Now let us return to divisibility. We restrict attention to $\mathbf{N}$. The relation
$m \mid n$ is a partial order on $\mathbf{N}$, since if $m \mid n$ and $n \mid p$ then $m \mid p$, and since if
$m \mid n$ and $n \mid m$ then $m = n$. A partially ordered set $(A, \leq)$ is called a lattice
if whenever $a$ and $b$ are elements of $A$ then the set $\{a, b\}$ has an infimum,
denoted by $a \wedge b$, and a supremum, denoted by $a \vee b$.


Theorem 2.6.3 (i) The partially ordered set $(\mathbf{N}, \mid)$ is a lattice.

(ii) If $m, n \in \mathbf{N}$ then there exist $k, l \in \mathbf{Z}$ such that $m \wedge n = km + ln$
(Bachet’s theorem).

(iii) $(m \wedge n)(m \vee n) = mn$.


The element $m \wedge n$ is called the highest common factor of $m$ and $n$, and
is traditionally written as $(m, n)$ [risking confusion with the ordered pair
$(m, n)$]; $m \vee n$ is called the lowest common multiple of $m$ and $n$.

Bachet’s theorem is frequently called Bézout’s lemma; Bachet established 
the result in 1624. 


Proof Suppose that $m, n \in \mathbf{N}$. Let

$$H = \{h \in \mathbf{Z} : h = um + vn \text{ for some } u, v \in \mathbf{Z}\}.$$

Then $m = 1.m + 0.n \in H$ and $n = 0.m + 1.n \in H$. Since

$$(um + vn) + (u'm + v'n) = (u + u')m + (v + v')n \quad\text{and}$$
$$-(um + vn) = (-u)m + (-v)n,$$

$H$ is a subgroup of $(\mathbf{Z}, +)$. Further, $H$ is the smallest subgroup of $(\mathbf{Z}, +)$
containing $m$ and $n$, since if $K$ is a subgroup of $(\mathbf{Z}, +)$ which contains $m$ and
$n$ then it contains all the elements $um + vn$, with $u, v \in \mathbf{Z}$. We call $H$ the
subgroup generated by $m$ and $n$, and denote it by $\mathrm{Gp}(m, n)$. By Proposition
2.6.2 there exists $h \in \mathbf{Z}^+$ such that $H = \mathbf{Z}h$. Since $H \neq \{0\}$, $h > 0$. Then
there exist $k, l \in \mathbf{Z}$ such that $h = km + ln$. Since $m, n \in H$, $h \mid m$ and $h \mid n$.
Suppose that $h' \mid m$ and $h' \mid n$. Then $h' \mid (km + ln)$ and so $h' \mid h$. Thus $h$ is the
highest common factor of $m$ and $n$.

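Similarly $\mathbf{Z}m \cap \mathbf{Z}n$ is a subgroup of $(\mathbf{Z}, +)$, and $mn \in \mathbf{Z}m \cap \mathbf{Z}n$, so that
$\mathbf{Z}m \cap \mathbf{Z}n \neq \{0\}$. Thus there exists $g \in \mathbf{N}$ such that $\mathbf{Z}m \cap \mathbf{Z}n = \mathbf{Z}g$. Since
$g \in \mathbf{Z}m$, $m \mid g$, and similarly $n \mid g$. If $m \mid g'$ and $n \mid g'$ then $g' \in \mathbf{Z}m$ and $g' \in \mathbf{Z}n$,
so that $g' \in \mathbf{Z}g$. Thus $g \mid g'$, and so $g$ is the lowest common multiple of $m$
and $n$.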

We now show that $mn = hg$. Recall that $h = km + ln$. Since $m \mid g$, $mn \mid lng$,
and similarly $mn \mid kmg$. Thus $mn \mid (km + ln)g$; that is, $mn \mid hg$. On the other
hand, $m = sh$ and $n = th$ for some $s, t \in \mathbf{N}$. Then $m \mid sth$ and $n \mid sth$, so that
$sth$ is a common multiple of $m$ and $n$; consequently, $g \mid sth$. Thus $hg \mid sth^2$. But
$sth^2 = mn$, and so $hg \mid mn$. Consequently $mn = hg$.
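
The identity $mn = hg$ can be checked numerically straight from the lattice definitions. The following Python sketch is a brute-force illustration under the assumption that m and n are small positive integers; the name hcf_lcm is illustrative, and the search is deliberately naive and makes no use of Euclid's algorithm.

```python
def hcf_lcm(m, n):
    """Brute-force infimum and supremum of {m, n} in the lattice (N, |)."""
    common_divisors = [d for d in range(1, min(m, n) + 1)
                       if m % d == 0 and n % d == 0]
    common_multiples = [c for c in range(max(m, n), m * n + 1)
                        if c % m == 0 and c % n == 0]
    h = max(common_divisors)    # every common divisor divides h
    g = min(common_multiples)   # g divides every common multiple
    return h, g


if __name__ == "__main__":
    h, g = hcf_lcm(12, 18)
    print(h, g, h * g == 12 * 18)
```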


If the highest common factor of m and n is 1, we say that m and n are 
coprime, or relatively prime. Bachet’s theorem has the following consequence. 


Proposition 2.6.4 If $m$ and $n$ are coprime, and $m \mid nr$, then $m \mid r$.


Proof There exist $k, l \in \mathbf{Z}$ such that $1 = km + ln$, and so $r = kmr + lnr$.
Since $m$ divides each term on the right-hand side of this equation, it also
divides $r$.


Theorem 2.6.3 establishes the existence of the highest common factor of
two numbers, but it does not tell us how to find it. For this, we use
Euclid’s algorithm, which was given in Euclid’s Elements. This also enables us
to determine the constants in Bachet’s theorem.

It is convenient to work with $\mathbf{Z}^2 = \mathbf{Z} \times \mathbf{Z}$ with its product group
structure: the identity element is $(0, 0)$, $(j, k) + (j', k') = (j + j', k + k')$ and
$-(j, k) = (-j, -k)$. Any element $(j, k)$ of $\mathbf{Z}^2$ can be written uniquely as
$je_1 + ke_2$, where $e_1 = (1, 0)$ and $e_2 = (0, 1)$. Thus if $\theta : \mathbf{Z}^2 \to \mathbf{Z}^2$ is a
homomorphism, then $\theta((j, k)) = j\theta(e_1) + k\theta(e_2)$. We can express $\theta$ in terms of
matrices: if $\theta(e_1) = (\theta_{11}, \theta_{12})$ and $\theta(e_2) = (\theta_{21}, \theta_{22})$ then

$$\theta((j, k)) = (j, k)\begin{bmatrix} \theta_{11} & \theta_{12} \\ \theta_{21} & \theta_{22} \end{bmatrix} = (j\theta_{11} + k\theta_{21},\; j\theta_{12} + k\theta_{22}).$$


Suppose that $m_0 > n_0 > 0$ and that we want to find $h_0 = m_0 \wedge n_0$. Thus
we want to find $h_0$ such that $\mathrm{Gp}(m_0, n_0) = \mathbf{Z}h_0$.

We divide: by Proposition 2.6.1, there exist $q_0$ and $r_0$, with $0 \leq r_0 < n_0$,
such that $m_0 = q_0 n_0 + r_0$. We set $m_1 = n_0$ and $n_1 = r_0$. Thus

$$(m_1, n_1) = (m_0, n_0)\begin{bmatrix} 0 & 1 \\ 1 & -q_0 \end{bmatrix} = (m_0, n_0)M_1, \text{ say, and}$$

$$(m_0, n_0) = (m_1, n_1)\begin{bmatrix} q_0 & 1 \\ 1 & 0 \end{bmatrix} = (m_1, n_1)N_1, \text{ say.}$$

From these equations, it follows that $m_1$ and $n_1$ are in $\mathrm{Gp}(m_0, n_0)$, so that
$\mathrm{Gp}(m_1, n_1) \subseteq \mathrm{Gp}(m_0, n_0)$, and that $m_0$ and $n_0$ are in $\mathrm{Gp}(m_1, n_1)$, so that
$\mathrm{Gp}(m_0, n_0) \subseteq \mathrm{Gp}(m_1, n_1)$. Thus

$$\mathrm{Gp}(m_1, n_1) = \mathrm{Gp}(m_0, n_0) = \mathrm{Gp}(h_0).$$

If $n_0 \mid m_0$ then $n_1 = 0$ and $m_1 = h_0$. Otherwise, if $h_1 = m_1 \wedge n_1$, then
$\mathrm{Gp}(h_1) = \mathrm{Gp}(m_1, n_1) = \mathrm{Gp}(h_0)$, so that $h_1 = h_0$; in this case we iterate
the procedure. Since $0 \leq n_j < n_{j-1}$, the procedure must stop after a finite
number $k$ of iterations. Then $m_k = h_k = \cdots = h_0$ and $n_k = 0$. Since we
can write $(m_j, n_j) = (m_{j-1}, n_{j-1})M_j$ for $1 \leq j \leq k$, it follows that

$$(m_j, n_j) = (m_0, n_0)M_1 \ldots M_j = (m_0, n_0)P_j,$$

where $P_j = M_1 \ldots M_j = P_{j-1}M_j$.


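At each stage we can calculate the product $P_{j-1}M_j$, and so calculate $P_j$. In
particular, $(h_0, 0) = (m_k, n_k) = (m_0, n_0)P_k$, so that if

$$P_k = \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix}$$

then $h_0 = p_{11}m_0 + p_{21}n_0$.

Let us give a numerical example. Let $m_0 = 1677$ and $n_0 = 1131$. Then

$$q_0 = 1, \quad r_0 = 546, \quad (m_1, n_1) = (1677, 1131)\begin{bmatrix} 0 & 1 \\ 1 & -1 \end{bmatrix} = (1131, 546),$$

$$q_1 = 2, \quad r_1 = 39, \quad (m_2, n_2) = (1131, 546)\begin{bmatrix} 0 & 1 \\ 1 & -2 \end{bmatrix} = (546, 39),$$

$$q_2 = 14, \quad r_2 = 0, \quad (m_3, n_3) = (546, 39)\begin{bmatrix} 0 & 1 \\ 1 & -14 \end{bmatrix} = (39, 0).$$

Thus the highest common factor of 1677 and 1131 is 39. Further

$$P_3 = \begin{bmatrix} 0 & 1 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & -2 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & -14 \end{bmatrix} = \begin{bmatrix} 1 & -2 \\ -1 & 3 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & -14 \end{bmatrix} = \begin{bmatrix} -2 & 29 \\ 3 & -43 \end{bmatrix},$$

so that $39 = (-2).1677 + 3.1131$.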
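
The procedure translates directly into a short program. The following Python sketch keeps the running product $P_j = M_1 \ldots M_j$ as four integers and right-multiplies it by the matrix with rows $(0, 1)$ and $(1, -q)$ at each division; the function name euclid_bachet is an illustrative choice, and the inputs are assumed to satisfy $m_0 > n_0 > 0$ as in the text.

```python
def euclid_bachet(m0, n0):
    """Return (h, k, l) with h = hcf(m0, n0) = k * m0 + l * n0, for m0 > n0 > 0."""
    m, n = m0, n0
    # Entries of P = M_1 ... M_j, starting from the identity matrix.
    p11, p12, p21, p22 = 1, 0, 0, 1
    while n != 0:
        q, r = divmod(m, n)   # m = q * n + r with 0 <= r < n
        m, n = n, r           # (m_j, n_j) = (n_{j-1}, r_{j-1})
        # Right-multiply P by M = [[0, 1], [1, -q]].
        p11, p12 = p12, p11 - q * p12
        p21, p22 = p22, p21 - q * p22
    # Now (m, 0) = (m0, n0) P, so m = p11 * m0 + p21 * n0.
    return m, p11, p21


if __name__ == "__main__":
    h, k, l = euclid_bachet(1677, 1131)
    print(h, k, l, k * 1677 + l * 1131)
```

For the numerical example above this returns the highest common factor 39 together with the coefficients -2 and 3.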
We now turn to factorization. Our aim is to factorize a number as a product 
of simpler numbers. An element p of N is a prime, or a prime number, if it 


is not a unit (that is, is not equal to 1), and if the only elements of N which 
divide it are 1 and p. Bachet’s theorem provides an equivalent definition. 


Proposition 2.6.5 Suppose that $p \in \mathbf{N}$ and $p \neq 1$. The following are
equivalent:

(i) $p$ is a prime;
(ii) if $p \mid mn$ then $p \mid m$ or $p \mid n$.


Proof Suppose that $p$ is a prime, that $p \mid mn$ and that $p$ does not divide $m$.
Then the highest common factor of $m$ and $p$ is 1, and so by Bachet’s theorem
there exist $k, l \in \mathbf{Z}$ such that $1 = km + lp$. Thus $n = kmn + lpn$. Since $p$
divides each of the terms on the right-hand side, $p$ divides $n$.

If $q$ is not a prime, then $q = mn$ for some $m, n$ not equal to 1 or $q$.
Then $q \mid mn$, but $q$ does not divide either $m$ or $n$, since $m$ and $n$ are smaller
than $q$.


Theorem 2.6.6 (The fundamental theorem of arithmetic) If $n \in \mathbf{N}$ and
$n > 1$ then $n$ can be written uniquely as a product $p_1 \ldots p_k$ of primes, with
$p_1 \leq p_2 \leq \cdots \leq p_k$.


Proof First we use complete induction to show that $n$ can be written as a
product of primes. 2 is a prime, so $2 = p_1$ with $p_1 = 2$. Suppose that the
result holds for $m$ with $2 \leq m < n$. Let $A$ be the set of divisors of $n$ which
are greater than 1. $A$ is non-empty, since $n \in A$, and so $A$ has a least element
$p$. $p$ must be a prime, for otherwise $p = ab$, with $a, b > 1$; then $a \in A$ and
$a < p$. Then $n = pq$ for some $q < n$. If $q = 1$ then $n = p$ is a prime. Otherwise,
by the inductive hypothesis, we can
write $q = p_1 \ldots p_k$ as a product of primes, with $p_1 \leq p_2 \leq \cdots \leq p_k$. Since
$p_1 \in A$, $p \leq p_1$, and $n = pp_1 \ldots p_k$.

It is harder to show that the factorization is unique. Again we prove this
by complete induction. It is certainly true when $n = 2$. Suppose that the
result holds for $m$ with $2 \leq m < n$. Let $n = p_1 \ldots p_k = q_1 \ldots q_l$ be two
factorizations into primes, with $p_1 \leq p_2 \leq \cdots \leq p_k$ and $q_1 \leq q_2 \leq \cdots \leq q_l$.
Let $s = p_2 \ldots p_k$ and $t = q_2 \ldots q_l$, so that $n = p_1 s = q_1 t$. First we show that
$p_1 = q_1$. Suppose not, and suppose without loss of generality that $p_1 < q_1$.
Since $p_1 \mid q_1 t$, and since $p_1$ does not divide $q_1$, $p_1 \mid t$, so that $t = p_1 u$, for some
$u \in \mathbf{N}$. $u$ has a factorization $u = r_1 \ldots r_m$ into primes, and so $t = p_1 r_1 \ldots r_m$
is a factorization into primes. Since $t < n$, the factorization is unique when the
terms are rearranged in increasing order. Since $t = q_2 \ldots q_l$, with $q_2 \leq \cdots \leq
q_l$, $q_2$ is the least of $p_1, r_1, \ldots, r_m$, and so $q_1 \leq q_2 \leq p_1$, giving a contradiction.
Thus $p_1 = q_1$. Hence $s = p_2 \ldots p_k = t = q_2 \ldots q_l$. But $s < n$, and so the
factorization of $s$ is unique. Thus $k = l$ and $p_j = q_j$ for $2 \leq j \leq k$.
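
The existence half of the proof is effectively an algorithm: repeatedly split off the least divisor greater than 1, which must be a prime. The following Python sketch follows that idea by trial division; the function name prime_factors is illustrative, and the method is only practical for small inputs.

```python
def prime_factors(n):
    """Return the primes p_1 <= p_2 <= ... <= p_k whose product is n, for n > 1."""
    if n < 2:
        raise ValueError("n must be greater than 1")
    factors = []
    p = 2
    while n > 1:
        # Search upwards for the least divisor of n that is greater than 1.
        # As in the proof, this least divisor is a prime. It is never smaller
        # than the previous one, so the search can resume from p.
        while n % p != 0:
            p += 1
        factors.append(p)
        n //= p
    return factors


if __name__ == "__main__":
    print(prime_factors(1677), prime_factors(1131))
```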

Corollary 2.6.7 There are infinitely many primes.


Proof Suppose, on the contrary, that there are only finitely many primes
$p_1, \ldots, p_k$. Let $n = p_1 \ldots p_k + 1$. Then $p_j$ does not divide $n$, for $1 \leq j \leq k$, so
that $n$ has no prime divisors. Since $n > 1$, this contradicts Theorem 2.6.6.


Exercises 


2.6.1 Suppose that $(X, \leq)$ is a lattice. Show that $(a \wedge b) \wedge c = a \wedge (b \wedge c)$. Is
$(a \wedge b) \vee c = a \wedge (b \vee c)$ always true?

2.6.2 Show that a maximal element of a lattice $L$ is the greatest element of $L$.

2.6.3 Show that the subgroups of a group, ordered by inclusion, form a
lattice.

2.6.4 What is the highest common factor of the Fibonacci numbers $F_{n+1}$
and $F_n$? How many steps does Euclid’s algorithm take to evaluate it?
What is the highest common factor of the Fibonacci numbers $F_{n+2}$
and $F_n$?

2.6.5 Use Euclid’s algorithm to find numbers $m$ and $n$ such that $81m -
100n = 1$.

2.6.6 Recall that two natural numbers $a$ and $b$ are coprime if their highest
common factor is 1. Use Bachet’s theorem to show that if $a$ and $b$ are
coprime and $a$ and $c$ are coprime, then $a$ and $bc$ are coprime. Give
another proof, using Theorem 2.6.6.

2.6.7 Show that given $k \in \mathbf{N}$ there exists $n \in \mathbf{N}$ such that $n + j$ is not a
prime, for $1 \leq j \leq k$.

2.6.8 By considering numbers of the form $4p_1 \ldots p_k - 1$, show that there are
infinitely many primes of the form $4c - 1$.

2.6.9 Show that there are infinitely many primes of the form $6c - 1$.

2.6.10 Suppose that $p$ is a prime. Show that $p$ divides $\binom{p}{r}$ for $1 \leq r < p$.


2.7 The field of rational numbers 


In $\mathbf{Z}$, we can add, multiply, and subtract but, as we have seen in the previous
section, division is very limited (although also very interesting). In this section, we
embed $\mathbf{Z}$ in a set $\mathbf{Q}$ of quotients, in which we can add, subtract, multiply and
divide (but not by 0), according to the usual laws of algebra.

Let us make this last remark explicit. A field is a set $F$, together with two
laws of composition, addition ($+$) and multiplication ($\circ$), with the following
properties.

(i) $(F, +)$ is an abelian group, with identity element 0.
(ii) Let $F^* = F \setminus \{0\}$. Then $(F^*, \circ)$ is an abelian group under multiplication,
with identity element 1.
(iii) There is a distributive law:

$$a \circ (b + c) = (a \circ b) + (a \circ c), \quad\text{for } a, b, c \in F.$$

Note that $(b + c) \circ a = (b \circ a) + (c \circ a)$, by the commutativity of multiplication.
Note also that $1 \in F^*$, so that $0 \neq 1$, and that $a \circ 0 = a \circ (0 + 0) = a \circ 0 + a \circ 0$,
so that $a \circ 0 = 0$; similarly $0 \circ a = 0$. We denote the additive inverse of $a$ by
$-a$, and the multiplicative inverse (if $a \neq 0$) by $a^{-1}$.

As an example, let $\mathbf{Z}_2$ consist of two elements 0 and 1. With the following
laws of addition and multiplication

$$0 + 0 = 1 + 1 = 0; \quad 0 + 1 = 1 + 0 = 1; \quad 0 \circ 0 = 0 \circ 1 = 1 \circ 0 = 0; \quad 1 \circ 1 = 1,$$

$\mathbf{Z}_2$ becomes a field.


Proposition 2.7.1 Suppose that $F$ is a field and that $\phi : (\mathbf{Z}, +) \to (F, +)$
is the homomorphism of Proposition 2.5.2 for which $\phi(1) = 1_F$. Then $\phi(mn) = \phi(m)\phi(n)$ for
$m, n \in \mathbf{Z}$.

Proof Suppose that $n \in \mathbf{Z}$. If $m \in \mathbf{Z}$, let $\psi_n(m) = \phi(mn) - \phi(m)\phi(n)$. If
$m_1, m_2 \in \mathbf{Z}$ then

$$\phi((m_1 + m_2)n) = \phi(m_1 n + m_2 n) = \phi(m_1 n) + \phi(m_2 n), \text{ and}$$
$$\phi(m_1 + m_2)\phi(n) = (\phi(m_1) + \phi(m_2))\phi(n) = \phi(m_1)\phi(n) + \phi(m_2)\phi(n),$$

so that $\psi_n(m_1 + m_2) = \psi_n(m_1) + \psi_n(m_2)$. Thus $\psi_n$ is a homomorphism of
$(\mathbf{Z}, +)$ into $(F, +)$. But $\psi_n(1) = \phi(n) - \phi(1)\phi(n) = 0$ and $\mathbf{Z}$ is generated by 1,
and so $\psi_n(\mathbf{Z}) = \{0\}$.
Thus $\phi(mn) = \phi(m)\phi(n)$.


A subset $H$ of a field $F$ is a subfield of $F$ if $H$ is a subgroup of the additive
group $(F, +)$ and $H \cap F^*$ is a subgroup of the multiplicative group $F^*$. It
then inherits the field structure from $F$.

A mapping $\theta$ from a field $F$ to a field $G$ is a field homomorphism if

• it is a homomorphism of the additive group $(F, +)$ into $(G, +)$, and
• $\theta(F^*) \subseteq G^*$ and $\theta|_{F^*}$ is a homomorphism of the multiplicative group $(F^*, \circ)$
into $(G^*, \circ)$.

In particular, if $\theta$ is a field homomorphism then $\theta(0_F) = 0_G$ and $\theta(1_F) = 1_G$.

Suppose that $\theta : F \to G$ is a field homomorphism, and that $f$ and $f'$ are
distinct elements of $F$. Let $h = f - f'$. Then $h \neq 0_F$, and $\theta(h)\theta(h^{-1}) = 1_G$.
Thus $\theta(f) - \theta(f') = \theta(h) \neq 0_G$, so that $\theta(f) \neq \theta(f')$. Consequently, $\theta$ is
injective.

A surjective field homomorphism is called a field isomorphism.

Suppose that $F$ is a field. A polynomial over $F$ of degree $n$ is an expression
of the form $p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$, where the coefficients $a_j$ are
in $F$ and $a_n \neq 0$. It is monic if $a_n = 1$. The polynomial $p$ defines a polynomial
function $p : F \to F$ defined by setting $p(r) = a_n r^n + a_{n-1} r^{n-1} + \cdots + a_1 r + a_0$.
An element $r$ of $F$ is a root of $p$ if $p(r) = 0$.

We shall embed $\mathbf{Z}$ in a field $\mathbf{Q}$. We are all familiar with the notion of a
fraction, and of the fact that different fractions, such as 2/3 and 4/6, represent
the same number. Let us formalize this. Let $\mathbf{Z}^* = \mathbf{Z} \setminus \{0\}$ be the set of
non-zero integers. We define a relation on $\mathbf{Z} \times \mathbf{Z}^*$ by setting $(p, q) \sim (r, s)$ if
$ps = qr$.


Proposition 2.7.2 The relation $(p, q) \sim (r, s)$ is an equivalence relation
on $\mathbf{Z} \times \mathbf{Z}^*$.


Proof It follows immediately from the definition that $(p, q) \sim (p, q)$ and
that if $(p, q) \sim (r, s)$ then $(r, s) \sim (p, q)$. Suppose that $(p, q) \sim (r, s)$ and
$(r, s) \sim (t, u)$, so that $ps = qr$ and $ru = ts$. Thus

$$pus = (ps)u = (qr)u = q(ru) = q(ts) = qts,$$

so that $(pu - qt)s = 0$. Since $s \neq 0$, $pu = qt$, and $(p, q) \sim (t, u)$.


We denote the set of equivalence classes by $\mathbf{Q}$, and denote the equivalence
class $[(p, q)]$ by $p/q$, or $\frac{p}{q}$. The elements of $\mathbf{Q}$ are called rational numbers. If
$r = p/q \in \mathbf{Q}$, we call $p/q$ a fraction, representing $r$. Many different fractions
represent $r$; for example, 2/3 and 4/6 represent the same element of $\mathbf{Q}$. It
follows immediately from the definition of the equivalence relation on $\mathbf{Z} \times \mathbf{Z}^*$
that $j/k = j'/k'$ if and only if $jk' = j'k$. In particular, $j/k = (-j)/(-k)$, so
that we can represent $r$ as $j/n$, where $j \in \mathbf{Z}$ and $n \in \mathbf{N}$.

Let us consider the structure of the equivalence classes further. 


Proposition 2.7.3 (i) Suppose that $(m, n) \in \mathbf{N} \times \mathbf{N}$. Then there exists a
unique $(m', n') \in [(m, n)]$ with $m'$ and $n'$ coprime. Then

$$[(m, n)] = \{a \in \mathbf{N} \times \mathbf{N} : a = (km', kn') \text{ for some } k \in \mathbf{N}\}.$$

(ii) Suppose that $(-m, n) \in -\mathbf{N} \times \mathbf{N}$. Then there exists a unique $(-m', n') \in
[(-m, n)]$ with $m'$ and $n'$ coprime. Then

$$[(-m, n)] = \{a \in -\mathbf{N} \times \mathbf{N} : a = (-km', kn') \text{ for some } k \in \mathbf{N}\}.$$


Proof (i) Let $h$ be the highest common factor of $m$ and $n$, and let $m' = m/h$,
$n' = n/h$. Then $m'$ and $n'$ are coprime, and $mn' = hm'n' = m'n$, so that
$(m, n) \sim (m', n')$. If $(m'', n'') \in [(m, n)]$ then $m''n' = m'n''$, so that $m' \mid m''$,
by Proposition 2.6.4. Let $m'' = km'$; then $km'n' = m''n' = m'n''$; dividing
by $m'$, we see that $n'' = kn'$. Thus $(m'', n'') = (km', kn')$. From this it follows
that $(m', n')$ is the only element of $[(m, n)]$ with $m'$ and $n'$ coprime.

The proof of (ii) is essentially the same as the proof of (i).
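
The coprime representative can be computed by dividing out the highest common factor, exactly as in the proof. The following Python sketch uses the standard library function math.gcd; the function name lowest_terms is illustrative, and moving any sign into the numerator is a convention chosen here so that the denominator lies in $\mathbf{N}$.

```python
from math import gcd


def lowest_terms(m, n):
    """Return (m', n') equivalent to (m, n), with n' > 0 and |m'|, n' coprime."""
    if n == 0:
        raise ValueError("the denominator must be non-zero")
    if n < 0:
        # (m, n) ~ (-m, -n), so we may assume that the denominator is positive.
        m, n = -m, -n
    h = gcd(m, n)   # highest common factor of |m| and n; equals n when m == 0
    return m // h, n // h


if __name__ == "__main__":
    print(lowest_terms(4, 6), lowest_terms(-4, 6), lowest_terms(4, -6))
```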


In other words, if $r$ is a non-zero rational number, we can write $r$ uniquely as $r = m/n$ or $r =
(-m)/n$, with $m$ and $n$ coprime. In this case, we say that the fraction $m/n$ is
in lowest terms. As an example, a dyadic number or dyadic rational number
is a rational number of the form $m/2^k$, where $m \in \mathbf{Z}$ and $k \in \mathbf{Z}^+$. If $k \geq 1$
then it is in lowest terms if and only if $m$ is odd.

We now show how to define addition and multiplication in $\mathbf{Q}$, so that $\mathbf{Q}$
becomes a field. We give the details, though they are very straightforward.
First we define addition. We define $p/q + r/s = (ps + qr)/qs$. If $(p, q) \sim (p', q')$
and $(r, s) \sim (r', s')$ then

$$(ps + qr)q's' = (pq')(ss') + (rs')(qq') = (p'q)(ss') + (r's)(qq') = (p's' + q'r')qs,$$

and so this is well-defined: it does not depend on the choice of representatives.
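
Independence of the choice of representatives can be seen in a small experiment. The following Python sketch represents a fraction by a pair (p, q) of integers with q non-zero; the function names equivalent and add_fractions are illustrative, and the check on one example is of course no substitute for the calculation above.

```python
def equivalent(x, y):
    """(p, q) ~ (r, s) exactly when p * s == q * r."""
    (p, q), (r, s) = x, y
    return p * s == q * r


def add_fractions(x, y):
    """p/q + r/s = (p*s + q*r)/(q*s), computed on representatives."""
    (p, q), (r, s) = x, y
    return (p * s + q * r, q * s)


if __name__ == "__main__":
    # 2/3 and 4/6 represent the same rational, as do 1/2 and (-3)/(-6).
    first = add_fractions((2, 3), (1, 2))
    second = add_fractions((4, 6), (-3, -6))
    print(first, second, equivalent(first, second))
```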

Proposition 2.7.4 $(\mathbf{Q}, +)$ is an abelian group.


Proof This is a matter of straightforward verification. Addition is
associative, since

$$\left(\frac{p}{q} + \frac{r}{s}\right) + \frac{t}{u} = \frac{ps + qr}{qs} + \frac{t}{u} = \frac{psu + qru + qst}{qsu} = \frac{p}{q} + \frac{ru + ts}{su} = \frac{p}{q} + \left(\frac{r}{s} + \frac{t}{u}\right),$$

and clearly $p/q + r/s = r/s + p/q$. The element $0/1$ is the identity, since
$0/1 + p/q = p/q + 0/1 = p/q$ for all $(p, q) \in \mathbf{Z} \times \mathbf{Z}^*$. Similarly,

$$\frac{p}{q} + \frac{-p}{q} = \frac{pq - pq}{q^2} = \frac{0}{q^2} = \frac{0}{1},$$

so that $(-p)/q$ is the additive inverse of $p/q$.


Next we define multiplication. We define (p/q)(r/s) = (pr)/(qs); once 
again, as the reader should verify, this does not depend on the choice of 
representatives. Let Q* = Q\ {0/1} be the set of non-zero rational numbers. 


Proposition 2.7.5 (Q*,.) is an abelian group, with identity element 1/1. 
The inverse of p/q is q/p. 


Proof The details are left as an easy exercise for the reader. 
Theorem 2.7.6 (Q,+,.) is a field. 


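Proof It remains to prove the distributive law:

$$\frac{p}{q}\left(\frac{r}{s} + \frac{t}{u}\right) = \frac{p}{q}\left(\frac{ru + ts}{su}\right) = \frac{pru + pts}{qsu} = \frac{pru}{qsu} + \frac{pts}{qsu} = \frac{pr}{qs} + \frac{pt}{qu} = \left(\frac{p}{q}\right)\left(\frac{r}{s}\right) + \left(\frac{p}{q}\right)\left(\frac{t}{u}\right).$$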


We now embed $\mathbf{Z}$ into $\mathbf{Q}$. If $n \in \mathbf{Z}$, let $\phi(n) = n/1$. It then follows immediately
from the definitions that $\phi$ is an injective homomorphism of the additive
group $(\mathbf{Z}, +)$ into the additive group $(\mathbf{Q}, +)$, and that $\phi(mn) = \phi(m)\phi(n)$
for $m, n \in \mathbf{Z}$. Summing up:


Theorem 2.7.7 With addition and multiplication defined as above, $\mathbf{Q}$ is a
field. $(\mathbf{Q}, +)$ has identity element $0/1$, and the multiplicative identity is $1 =
1/1$. The additive inverse of $j/n$ is $(-j)/n$; and, if $m \in \mathbf{N}$, the multiplicative
inverse of $m/n$ is $n/m$ and the multiplicative inverse of $(-m)/n$ is $(-n)/m$.
There is an injective map $\phi : \mathbf{Z} \to \mathbf{Q}$ such that $\phi(0) = 0$, $\phi(1) = 1$, and
$\phi(j + k) = \phi(j) + \phi(k)$, $\phi(jk) = \phi(j)\phi(k)$ for all $j, k \in \mathbf{Z}$.


We identify Z with φ(Z), and consider Z as a subset of the field Q. Thus we
write n for n/1, so that 0 is the zero element of Q, and 1 is the multiplicative
identity.


Exercises 


2.7.1 Show that there is a field with four elements, and that there is no field 
with six elements. 

2.7.2 Prove the binomial theorem: if F is a field, if x,y ∈ F and if n ∈ N
then

\[ (x+y)^n = x^n + \binom{n}{1}x^{n-1}y + \cdots + \binom{n}{j}x^{n-j}y^j + \cdots + \binom{n}{n-1}xy^{n-1} + y^n. \]

2.7.3 Suppose that r = m/n is a rational number in lowest terms, and that
0 < r < 1. Show that there exists k ∈ N such that 1/(k+1) ≤ r < 1/k.
Show that if r ≠ 1/(k+1) and r − 1/(k+1) = p/q in lowest terms,
then p < m. Deduce that there exist 1 < n_1 < ... < n_t such that
r = 1/n_1 + ··· + 1/n_t.

2.7.4 We have adjoined additive inverses to Z⁺ to construct Z, and we have
adjoined multiplicative inverses to Z* to construct Q. These are special
cases of a general construction to adjoin inverses. We need some
definitions. A monoid is a set S with a binary associative operation
∘ : S × S → S, together with an element e of S (the identity element)
for which s∘e = e∘s = s, for all s ∈ S. S is commutative, or abelian,
if s∘t = t∘s for all s,t ∈ S. S has a cancellation law if whenever
s∘u = t∘u then s = t, and whenever u∘s = u∘t then s = t.
Suppose that S is a commutative monoid with a cancellation law.

(a) Define a relation on S × S by setting (p,q) ~ (r,s) if p∘s = r∘q.
Show that this is an equivalence relation on S × S.
Let G be the set of equivalence classes.

(b) Suppose that g = [(p,q)], h = [(r,s)]. Let g + h = [(p∘r, q∘s)].
Show that this is well-defined: it does not depend on the choice
of representatives.

(c) Show that addition is associative and commutative.

(d) Show that (G,+) is an abelian group, with identity [(e,e)] and with
−[(p,q)] = [(q,p)].

(e) Let θ : S → G be defined by θ(s) = [(s,e)]. Show that θ is injective
and that θ(s∘t) = θ(s) + θ(t).

(f) Show that G = θ(S) − θ(S).

2.7.5 Use the results of the previous question to provide another construction
of (Z,+) from Z⁺.

2.7.6 There are circumstances (as in the construction of Q), where in Exercise
2.7.4 it is natural to denote the composition in G multiplicatively.
Do this, when S = Z*[x] is the set of non-zero polynomials with
integer coefficients, and where composition is the multiplication of
polynomials:

\[ \text{if } p = \sum_{i=0}^{m} a_i x^i \text{ and } q = \sum_{j=0}^{n} b_j x^j \text{ then } p \circ q = \sum_{k=0}^{m+n} c_k x^k, \]

where c_k = Σ{a_i b_j : i ≥ 0, j ≥ 0, i + j = k}. What have you
constructed?
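
The procedure suggested by Exercise 2.7.3 (repeatedly subtract the largest unit fraction 1/(k+1) that does not exceed r) can be sketched as follows. This is an illustration under the assumption that r is supplied as a Python `Fraction` with 0 < r < 1; the function name `unit_fraction_sum` is hypothetical.

```python
from fractions import Fraction

def unit_fraction_sum(r):
    """Return n_1 < n_2 < ... < n_t with r = 1/n_1 + ... + 1/n_t, for 0 < r < 1."""
    if not 0 < r < 1:
        raise ValueError("expected 0 < r < 1")
    denominators = []
    while r > 0:
        # k + 1 is the least integer n with 1/n <= r, that is, ceil(1/r).
        n = -(-r.denominator // r.numerator)
        denominators.append(n)
        r -= Fraction(1, n)  # the numerator in lowest terms strictly decreases
    return denominators

print(unit_fraction_sum(Fraction(4, 13)))  # [4, 18, 468]
```

The loop terminates because the numerators form a strictly decreasing sequence of positive integers, which is the content of the inequality p < m in the exercise.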


2.8 Ordered fields 
We introduce an order on Q. We set j/m ≤ k/n if jn ≤ km.


Proposition 2.8.1 (i) The relation ≤ is a well-defined total order on Q.
(ii) If r ≤ s, then r + t ≤ s + t for all t ∈ Q.
(iii) If r ≤ s, then rt ≤ st for all t ∈ Q with t ≥ 0.
(iv) If m,n ∈ Z then m ≤ n in the order on Z if and only if m ≤ n in
the order on Q.


Proof The straightforward verifications are left as an exercise for the reader. 
(Remember that m and n are positive.) 
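
The cross-multiplication test can be tried out directly. The sketch below assumes that rationals are given as pairs (j, m) with positive denominator m, as the remark above requires; the function name `leq` is illustrative only.

```python
def leq(a, b):
    """j/m <= k/n exactly when j*n <= k*m, for positive denominators m, n."""
    (j, m), (k, n) = a, b
    if m <= 0 or n <= 0:
        raise ValueError("denominators must be positive")
    return j * n <= k * m

# (-1, 2) and (-3, 6) both represent -1/2; (1, 3) and (2, 6) both represent 1/3.
print(leq((-1, 2), (1, 3)), leq((-3, 6), (2, 6)), leq((1, 3), (-1, 2)))  # True True False
```

The first two calls use different representatives of the same pair of rationals and agree, as well-definedness demands.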


A field with a total order that satisfies conditions (ii) and (iii) of Proposition
2.8.1 is called an ordered field. Note that if F is an ordered field, and
f ∈ F, then f² ≥ 0. For if f ≥ 0, then f² ≥ 0, and if f < 0 then −f > 0,
so that f² = (−f)² > 0. In particular, 1 = 1² > 0. If F is an ordered field,
and f ∈ F, we say that f is positive if f > 0; we say that f is non-negative if
f ≥ 0; we say that f is negative if f < 0, and we say that f is non-positive if
f ≤ 0.

An ordered field contains a copy of Q as a subfield. We prove this in two 
steps. 


Proposition 2.8.2 If F is an ordered field, there exists a unique injective
map ψ : Z → F such that ψ(0) = 0_F, ψ(1) = 1_F, ψ(k + l) = ψ(k) + ψ(l).
Further, ψ(kl) = ψ(k)ψ(l) for all k,l ∈ Z, and ψ(k) < ψ(l) if k < l.


Proof By Proposition 2.5.2, there exists a unique map ψ : Z → F such that
ψ(0) = 0_F, ψ(1) = 1_F and ψ(k + l) = ψ(k) + ψ(l), for k,l ∈ Z.

A straightforward induction then shows that ψ(ml) = ψ(m)ψ(l) for
m ∈ Z⁺, l ∈ Z. Since ψ((−m)l) + ψ(ml) = ψ((−m)l + ml) = ψ(0) = 0,
ψ((−m)l) = −ψ(ml) = −(ψ(m)ψ(l)). Since (ψ(−m) + ψ(m))ψ(l) = ψ(0)ψ(l) = 0,
−(ψ(m)ψ(l)) = ψ(−m)ψ(l). Thus ψ((−m)l) = ψ(−m)ψ(l),
and ψ(kl) = ψ(k)ψ(l) for all k,l ∈ Z.

We show by induction that if m ∈ N then ψ(m) > 0_F. The result is true
if m = 1, by the preceding remark. If it is true for m, then ψ(m + 1) =
ψ(m) + ψ(1) = ψ(m) + 1_F > ψ(m) > 0_F. Thus if k < l then l − k ∈ N and
ψ(l) − ψ(k) = ψ(l − k) > 0: ψ(k) < ψ(l). Further, ψ is injective, for if k ≠ l
and k < l then ψ(l) − ψ(k) = ψ(l − k) > 0, so that ψ(l) ≠ ψ(k): similarly, if
k > l.


Theorem 2.8.3 Suppose that F is an ordered field. Then there exists
a unique injective field homomorphism k : Q → F. Further, k is
order-preserving: if r < s then k(r) < k(s).


Proof Let ψ : Z → F be the unique mapping of the previous proposition.
If j ∈ Z, we define k(j) = ψ(j), and if r = j/n ∈ Q, we define k(r) =
ψ(j)(ψ(n))⁻¹. Now ψ(j)(ψ(n))⁻¹ = ψ(j')(ψ(n'))⁻¹ if and only if ψ(jn') =
ψ(j)ψ(n') = ψ(j')ψ(n) = ψ(j'n), and this happens if and only if j/n = j'/n'.
Thus k is well defined, and is injective. It is a straightforward matter to verify
that k satisfies the other requirements of the theorem.


This shows that every ordered field has a subfield isomorphic to Q. Q itself
has no proper subfield. For every subfield must contain 0 and 1, and so must
contain (a copy of) Z. Thus it must contain all elements of the form j/n,
with j ∈ Z and n ∈ N. Thus we have the following characterization of the
rational numbers.


Corollary 2.8.4 An ordered field F is isomorphic as a field to Q if and
only if it has no proper subfields.


We can therefore take any ordered field with no proper subfields as a model
for the field Q of rational numbers.


Exercises 


2.8.1 Suppose that A is a countable totally ordered set with the intermediate
property (if a < b then there exists c with a < c < b) and with
no greatest or least element. Show that there is an order-preserving
bijection j : A → Q.

2.8.2 Give the details of the proof of Proposition 2.8.1.

2.8.3 (a) Suppose that a,b,v are elements of an ordered field F and that
a ≥ v ≥ b > 0. Show that ab ≤ v(a + b − v).

(b) Suppose that a_1,...,a_k are positive elements of an ordered field.
Let v = (a_1 + ··· + a_k)/k. Use (a) and an inductive argument to show
that v^k ≥ a_1 a_2 ... a_k.


2.9 Dedekind cuts 


The field of rational numbers is not adequate for our purpose. The ancient 
Greeks recognized the inadequacy of the rational numbers: the length of a 
diagonal of a square is not a rational multiple of the length of a side. 


Proposition 2.9.1 There is no rational number r with r? = 2. 


Proof Suppose that such an r exists; we can suppose that r is positive, and
that r = m/n in lowest terms. Then m² = 2n². Since 2 is prime, 2 divides
m, and so m = 2q for some q ∈ N. Then 4q² = 2n², and so 2q² = n². This
implies that 2 divides n, contradicting the fact that m and n are coprime.


This result can be extended greatly. 


Theorem 2.9.2 If p is a monic polynomial with integer coefficients, then 
any r € Q which is a root of p must be an integer. 


Proof If r ≠ 0, let r = j/q, in lowest terms, and write
p(x) = x^n + a_{n−1}x^{n−1} + ··· + a_1 x + a_0. Since p(r) = 0,

\[ 0 = q^{n-1}p(r) = \frac{j^n}{q} + \left[a_{n-1}j^{n-1} + a_{n-2}j^{n-2}q + \cdots + a_1 jq^{n-2} + a_0 q^{n-1}\right]. \]

The term in square brackets is an integer, and so therefore is j^n/q. Since
j and q are coprime, q = 1 and r = j, an integer.


This result is due to Gauss.


Corollary 2.9.3 If a ∈ N and n ∈ N then the polynomial x^n − a has a
rational root if and only if there exists b ∈ N such that a = b^n.
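
Corollary 2.9.3 reduces the question of a rational root of x^n − a to a search among the natural numbers. The following is a minimal sketch, assuming a and n are positive Python integers; the function name `integer_root` is illustrative, and the binary search is simply one convenient way to carry out the search.

```python
def integer_root(a, n):
    """Return b in N with b**n == a if such a b exists, and None otherwise."""
    lo, hi = 1, a
    while lo <= hi:
        mid = (lo + hi) // 2
        power = mid ** n
        if power == a:
            return mid
        if power < a:
            lo = mid + 1
        else:
            hi = mid - 1
    return None

print(integer_root(32, 5), integer_root(2, 2), integer_root(81, 4))  # 2 None 3
```

By the corollary, the result `None` for a = 2, n = 2 corresponds to the fact that x² − 2 has no rational root.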


Can we find a number system in which 2 has a square root, which avoids all 
other such anomalies, and which provides ‘a purely arithmetic and perfectly 
rigorous foundation for the principles of infinitesimal analysis’? Richard 
Dedekind, whose phrase this is, was the first person to give a satisfactory 
answer. He found a solution to the problem on 24 November 1858, but did 
not publish his findings until 1872. His essential insight was that a number 
such as √2 or π could be characterized by the set of rational numbers greater
than it, and by the set of rational numbers less than it. There are other ways 
of proceeding (see Exercise 3.4.3), but in many respects, Dedekind’s approach 
remains the best way of defining the real numbers, and it is essentially the 
way that we shall follow. As we have seen, the rational numbers Q form an 
ordered field: the order relation and the algebra structure interact. In this 
section, following Dedekind, we use the order structure of Q to define the 
set of real numbers R as a totally ordered set. In the next section, we shall 
extend the algebraic operations of addition and multiplication from Q to R. 

Suppose that (X,≤) is a totally ordered set. A non-empty subset A of X
is bounded above if it has an upper bound in X, is bounded below if it has
a lower bound in X, and is bounded if it is bounded above and below. The
totally ordered set (X,≤) is said to have the supremum property or least
upper bound property if whenever A is a non-empty subset of X which has an
upper bound then A has a supremum: there exists sup A ∈ X such that sup A
is an upper bound for A and if b is any other upper bound, then sup A ≤ b. It
is most important that sup A may or may not be an element of A. We shall
require R to have the supremum property. This fundamental order property
is the basis of almost all the analysis that we shall do.


Proposition 2.9.4 A totally ordered set (X,≤) has the supremum property
if and only if every non-empty subset B of X which has a lower bound
has an infimum.


Proof Suppose first that (X,≤) has the supremum property, and that B is
a non-empty subset of X which is bounded below. Let L be the set of lower
bounds of B. L is non-empty, and any element of B is an upper bound for
L. Thus L has a supremum, s, say. We shall show that s is the infimum of
B. If b ∈ B then b is an upper bound for L, and so b ≥ s. Thus s is a lower
bound for B. If c is a lower bound for B, then c ∈ L, and so c ≤ s; thus s is
the greatest lower bound of B.

Conversely, suppose that the condition is satisfied and that A is a non-empty
subset of X which has an upper bound. Then an exactly similar
argument shows that the set U of upper bounds of A has an infimum t,
and that t is the supremum of A.


We now show that we can embed the ordered field Q of rational numbers in 
an order-preserving way in a totally ordered set with the supremum property. 
This is the key construction. 


Theorem 2.9.5 There exists a totally ordered set (R,≤) with the supremum
property, together with an injective order-preserving map j : Q → R
such that

(a) if a,b ∈ R and a < b then there exists s ∈ Q such that a < j(s) < b,
and

(b) R has neither a greatest element nor a least element.


Proof We call a subset a of Q a Dedekind cut if it satisfies

(α) a is non-empty and bounded above,

(β) if r ∈ a and s < r then s ∈ a, and

(γ) a does not have a greatest element (if r ∈ a there exists t ∈ a with
t > r).

(Dedekind, who considered the pair {a, Q \ a}, used the word ‘Schnitt’,
which can also be translated as ‘section’, ‘slice’, or ‘intersection’.) As we shall
see, conditions (α) and (β) say that a is a semi-infinite interval, and condition
(γ) says that a is open.

Let R be the set of Dedekind cuts. We define an order on R by setting
a ≤ b if a ⊆ b.

First, we show that this is a total order on R. Suppose that a,b ∈ R and
that b is not less than or equal to a. Thus b is not contained in a, so that
there exists r in b \ a. If s > r, then s ∉ a, since otherwise r ∈ a, by (β). Thus
if t ∈ a, t < r, and so t ∈ b. Hence a ≤ b.

Next, we show that (R,≤) has the supremum property. Although this is
the essential property of R, the proof is quite straightforward. Suppose that
A is a non-empty subset of R with upper bound u. Let us set u₀ = ∪_{a∈A} a.

First, we show that u₀ is a Dedekind cut. u₀ is non-empty, and u₀ ⊆ u,
and so condition (α) is satisfied. Suppose that r ∈ u₀ and that s < r. Then
r ∈ a for some a ∈ A, and s ∈ a, by (β), so that s ∈ u₀. Thus condition
(β) is satisfied. Further, there exists t ∈ a with t > r, and then t ∈ u₀. Thus
condition (γ) is also satisfied. Hence u₀ is a Dedekind cut.

Next, we show that u₀ is the supremum of A. If a ∈ A, then a ⊆ u₀, so
that u₀ ≥ a: u₀ is an upper bound for A. If d is an upper bound for A then
d ≥ a for all a ∈ A, so that a ⊆ d for all a ∈ A, and so u₀ = ∪_{a∈A} a ⊆ d. Thus
d ≥ u₀; u₀ is the least upper bound of A.

Next, we define the mapping j : Q → R. If r ∈ Q, we set j(r) = {s ∈
Q : s < r}. Let us show that j(r) is a Dedekind cut. Since Q has no least
element, j(r) ≠ ∅, and r is an upper bound for j(r), so that condition (α)
is satisfied. Condition (β) is clearly satisfied. If s ∈ j(r), let t = (s + r)/2.
Then s < t < r, so that t ∈ j(r). Thus condition (γ) is satisfied, and j(r) is
a Dedekind cut.

The mapping j : Q → R is clearly an order-preserving mapping from
Q to R, and j is injective, for if r < s then r ∈ j(s) \ j(r), so that
j(r) ≠ j(s).

Let us now show that (a) holds. Suppose that a,b ∈ R and that a < b.
Then there exists r ∈ b \ a. By condition (γ), there exists s > r with s ∈ b.
Then s ∈ b \ j(s) and r ∈ j(s) \ a, so that a < j(s) < b.


Corollary 2.9.6 If a ∈ R then a = sup{j(r) ∈ j(Q) : j(r) < a}. In
particular, if t ∈ Q then j(t) = sup{j(r) : r < t}.


Now let us prove (b). Suppose that a ∈ R. If r ∈ a, then j(r) < a, so that
a is not a least element of R. If s is an upper bound in Q for a, there exists
t ∈ Q with t > s. Then s ∈ j(t) \ a, so that a < j(t) and a is not a greatest
element of R.


We define the real numbers to be the pair (R, j). We shall usually identify
Q with j(Q). Thus R is a totally ordered set with the supremum property,
with neither a greatest element nor a least element, which contains Q in
an order-preserving way, and which has the property that if a,b ∈ R and
a < b then there exists r ∈ Q such that a < r < b. We shall deduce all the
properties of R from this.
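
To make the construction concrete, a Dedekind cut can be modelled by its membership test on rationals. The sketch below is an illustration only: it assumes rationals are Python `Fraction` values, and the names `j`, `in_sqrt2_cut` and `larger_element` are introduced here, not taken from the text. It exhibits the cut {s ∈ Q : s < 0 or s² < 2}, which is not of the form j(r), and an explicit witness for the 'no greatest element' condition.

```python
from fractions import Fraction

def j(r):
    """Membership test for the cut j(r) = {s in Q : s < r}."""
    return lambda s: s < r

def in_sqrt2_cut(s):
    """Membership test for the cut {s in Q : s < 0 or s*s < 2}."""
    return s < 0 or s * s < 2

def larger_element(r):
    """Given a Fraction r in the cut, return t in the cut with t > r."""
    if r < 0:
        return Fraction(0)
    # For r >= 0 with r*r < 2: t - r = (2 - r*r)/(r + 2) > 0 and
    # t*t - 2 = (2*r*r - 4)/(r + 2)**2 < 0.
    return (2 * r + 2) / (r + 2)

r = Fraction(7, 5)
t = larger_element(r)
print(t, in_sqrt2_cut(r), in_sqrt2_cut(t), t > r)          # 24/17 True True True
print(j(Fraction(3, 2))(t), in_sqrt2_cut(Fraction(3, 2)))  # True False
```

The second line of output reflects the fact that 3/2 is an upper bound in Q for this cut, so that the cut lies below j(3/2) in the order on R.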


Exercises 


2.9.1 Suppose that n ∈ N and that (p/q)² = n. Show that if p − rq ≠ 0 then
((nq − rp)/(p − rq))² = n. Use this to give another proof that if n has
a rational square root then the square root is an integer.

2.9.2 Suppose that p is a polynomial of degree n with coefficients in a field
F. Show that if c is a root of p in F, then p(x) = (x − c)q(x), where q
is a polynomial of degree n − 1. Show that p has at most n roots in F.

2.9.3 Show that there is an order-preserving bijection j of Q onto Q \ {0}.
[Hint: use the intermediate property, and define j recursively, using an
enumeration of Q.]


2.10 The real number field


Let R be the set of real numbers, and let j : Q → R be the inclusion
mapping. So far, we have only established the order properties of R. We
now define the algebraic properties of R. First, we define the addition of real
numbers in such a way that (R,+) is an ordered abelian group and j is a
group homomorphism.

If x ∈ R let D(x) = {r ∈ Q : r < x}. By Corollary 2.9.6, D(x) is a
Dedekind cut.


Proposition 2.10.1 Suppose that x,y ∈ R. Then
D(x) + D(y) = {r + s : r ∈ D(x), s ∈ D(y)} = {r + s : r,s ∈ Q, r < x, s < y}
is a Dedekind cut.


Proof Let us check the conditions.

(α) The set D(x) + D(y) is not empty, and if r_x is an upper bound for
D(x) and r_y is an upper bound for D(y) then r_x + r_y is an upper bound for
D(x) + D(y).

(β) Suppose that r ∈ D(x), that s ∈ D(y) and that t < r + s. Let u = t − s.
Then u < r, so that u ∈ D(x). Hence t = u + s ∈ D(x) + D(y).

(γ) Suppose that w = r + s ∈ D(x) + D(y), with r ∈ D(x), s ∈ D(y).
Then there exists r' ∈ D(x) with r < r'. Then w' = r' + s ∈ D(x) + D(y)
and w < w'.


We define the real number x + y to be the Dedekind cut D(x) + D(y).
Then D(x + y) = D(x) + D(y).


Corollary 2.10.2 If x,y ∈ R and t ∈ Q, and if j(t) < x + y then there
exist r,s ∈ Q such that j(r) < x, j(s) < y and t = r + s.


Proof For t ∈ D(x + y) = D(x) + D(y).


Proposition 2.10.3 Suppose that x,y,z ∈ R and that r,s ∈ Q.

(i) x + y = y + x.

(ii) (x + y) + z = x + (y + z).

(iii) x + j(0) = x.

(iv) If x ≤ y then x + z ≤ y + z.

(v) j(r) + j(s) = j(r + s).


Proof (i)–(iv) are easy consequences of the definition, left as exercises for
the reader.

(v) We must show that D(j(r)) + D(j(s)) = D(j(r + s)). If t ∈ D(j(r)) and
u ∈ D(j(s)), then t < r and u < s, so that t + u < r + s and t + u ∈ D(j(r + s)).
Thus D(j(r)) + D(j(s)) ⊆ D(j(r + s)). Conversely, if t ∈ D(j(r + s)) then
t < r + s. Let ε = (r + s − t)/2, so that ε > 0, and let u = r − ε and v = s − ε.
Then u,v ∈ Q, u < r, v < s and t = u + v. Thus t ∈ D(j(r)) + D(j(s)),
so that D(j(r + s)) ⊆ D(j(r)) + D(j(s)).
Hence D(j(r)) + D(j(s)) = D(j(r + s)).


We now need to define −x, for x ∈ R, in such a way that x + (−x) = 0. If
−x exists, then

\[ D(-x) = \{r \in \mathbf{Q} : j(r) < -x\} = \{r \in \mathbf{Q} : x < j(-r)\}. \]

We therefore define M(x) = {r ∈ Q : x < j(−r)}.


Proposition 2.10.4 If x ∈ R then M(x) is a Dedekind cut.


Proof Again, we check the conditions.

(α) There exists s ∈ Q such that j(s) > x. Let r = −s. Then x < j(−r),
so that r ∈ M(x), and M(x) is not empty. Similarly there exists u ∈ Q such
that j(u) < x. Let t = −u. If r ∈ M(x) then j(−t) = j(u) < x < j(−r), so
that r < t; t is an upper bound for M(x).

(β) Suppose that u ∈ M(x). If t ∈ Q and t < u then x < j(−u) < j(−t),
so that t ∈ M(x).

(γ) Suppose that u ∈ M(x). There exists s ∈ Q such that x < j(s) <
j(−u), so that if t = −s then x < j(−t) < j(−u). Thus t ∈ M(x) and
u < t.


We now define the real number −x to be M(x). Thus M(x) = D(−x).


Theorem 2.10.5 If x ∈ R then x + (−x) = j(0).


Proof First we show that x + (−x) ≤ j(0). If r ∈ M(x) and s ∈ D(x) then
j(s) < x < j(−r), so that r + s < 0, and r + s ∈ D(j(0)). Consequently,
x + (−x) ≤ j(0).

Secondly, we show that x + (−x) ≥ j(0). Suppose that t ∈ D(j(0)), so that
t ∈ Q and t < 0. There exists r ∈ Q such that x + j(t) < j(r) < x. Thus
x < j(r − t) = j(−(t − r)), so that t − r ∈ M(x). Since j(r) < x, r ∈ D(x).
Thus t ∈ D(x) + M(x) = D(x) + D(−x) = D(x + (−x)). Consequently
D(j(0)) ⊆ D(x + (−x)), and so x + (−x) ≥ j(0).


Thus R is an ordered abelian group under addition, with identity ele- 
ment j(0), and the map r — j(r) is an order-preserving injective group 
homomorphism of Q into R. 

We now turn to multiplication. Here it is easiest first to define the product
of two non-negative elements of R, and then extend to the whole of R, just as
we did for Z. As we shall see, the programme is very similar to the programme
for defining addition, and we shall therefore omit many of the details. If
x ∈ R and x > 0, then the Dedekind cut D(x) contains all the negative
rational numbers, and we wish to avoid negative numbers. We therefore define
a positive Dedekind cut to be a non-empty subset a⁺ of {r ∈ Q : r > 0} which
is bounded above, does not have a greatest element, and has the property
that whenever r ∈ a⁺, s ∈ Q and 0 < s < r then s ∈ a⁺. If a⁺ is a positive
Dedekind cut, then a = a⁺ ∪ {r ∈ Q : r ≤ 0} is a Dedekind cut.

If x ∈ R and x > 0, let D⁺(x) = {r ∈ Q : 0 < r < x}. Then D⁺(x) is a
positive Dedekind cut, and x = sup(D⁺(x)). If x, y are positive real numbers,
we define D⁺(x).D⁺(y) to be {t ∈ Q : t = rs for some r ∈ D⁺(x), s ∈
D⁺(y)}.


Proposition 2.10.6 Suppose that x,y are positive real numbers. Then
D⁺(x).D⁺(y) is a positive Dedekind cut.


Proof Just like the proof of Proposition 2.10.1.


Thus D = D⁺(x).D⁺(y) ∪ {r ∈ Q : r ≤ 0} is a Dedekind cut. We set
xy = x.y = D. Then D(xy) = D and D⁺(xy) = D⁺(x).D⁺(y).


Corollary 2.10.7 Suppose that x,y are positive real numbers, that t ∈ Q
and that 0 < j(t) < xy. Then there exist r,s ∈ Q such that 0 < j(r) < x,
0 < j(s) < y and t = rs.


Proof For t ∈ D⁺(xy) = D⁺(x).D⁺(y).


Proposition 2.10.8 Suppose that x,y,z are positive real numbers and
that r and s are positive rational numbers.

(i) xy = yx.

(ii) (xy)z = x(yz).

(iii) j(1).x = x.

(iv) If x ≤ y then xz ≤ yz.

(v) x(y + z) = xy + xz.

(vi) j(rs) = j(r).j(s).


Proof The proofs of (i)–(iv) follow from the definitions.

(v) This is also easy, but here are the details. We need to show that

\[ D^+(x).(D^+(y) + D^+(z)) = (D^+(x).D^+(y)) + (D^+(x).D^+(z)). \]

Clearly

\[ D^+(x).(D^+(y) + D^+(z)) \subseteq (D^+(x).D^+(y)) + (D^+(x).D^+(z)). \]

Suppose that rs + r't ∈ (D⁺(x).D⁺(y)) + (D⁺(x).D⁺(z)). Let r'' =
max(r,r'). Then r''(s + t) ∈ D⁺(x).(D⁺(y) + D⁺(z)) and 0 < rs + r't ≤
r''(s + t), so that rs + r't ∈ D⁺(x).(D⁺(y) + D⁺(z)). Thus

\[ (D^+(x).D^+(y)) + (D^+(x).D^+(z)) \subseteq D^+(x).(D^+(y) + D^+(z)). \]

(vi) Just like the proof of Proposition 2.10.3 (v).


Suppose that x ∈ R and that x > 0. We want to show that x has a
multiplicative inverse x⁻¹. Following the ideas behind the construction of
additive inverses, we set

\[ I(x) = \{r \in \mathbf{Q} : x < j(1/r)\} = \{r \in \mathbf{Q} : j(r)x < 1\}. \]


Proposition 2.10.9 If x ∈ R and x > 0 then I(x) is a Dedekind cut.


Proof Just like the proof of Proposition 2.10.4.


We now define x⁻¹ to be I(x). Thus I(x) = D(x⁻¹).


Theorem 2.10.10 If x ∈ R and x > 0 then x.x⁻¹ = x⁻¹.x = j(1).


Proof Just like the proof of Theorem 2.10.5.


Thus {x ∈ R : x > 0} is an abelian group under multiplication, and x⁻¹
is the multiplicative inverse of x; we also write it as 1/x.

We now extend multiplication to R and multiplicative inversion to R* =
R \ {0}. If x,y ∈ R*, we set (−x)y = x(−y) = −(xy) and (−x)(−y) = xy,
and if x > 0, we set 1/(−x) = −(1/x).

With these definitions, R becomes an ordered field with the supremum
property. The mapping j : Q → R is an order-preserving field isomorphism
of Q onto a subfield of R, which we shall now identify with Q. The elements
of j(Q) are rational numbers; the elements of R \ j(Q) are called irrational
numbers.

We shall show in Theorem 3.3.1 that any ordered field with the supremum 
property is isomorphic as an ordered field to the ordered field R of real 
numbers, and that the isomorphism is unique. 

After all this work, we should verify that we can use the real numbers R 
to solve the problem that we raised at the beginning of the previous section. 
In fact we can say more. 


Theorem 2.10.11 Suppose that y is a positive real number and that
n ∈ N. Then there exists a unique positive real number s such that s^n = y.


We need the following lemma.


Lemma 2.10.12 (i) Suppose that 0 < ε < 1 and that n ∈ N. Then

\[ (1 - n\varepsilon) \le (1-\varepsilon)^n < (1+\varepsilon)^n \le 1 + (2^n - 1)\varepsilon. \]

(ii) Suppose that a and b are positive real numbers and that n ∈ N. Then
a^n > b^n if and only if a > b.


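Proof (i) The proof is by induction on n: the result is true if n = 1. Suppose
that it is true for n. Then

\[ (1-\varepsilon)^{n+1} = (1-\varepsilon)(1-\varepsilon)^n \ge (1-\varepsilon)(1-n\varepsilon) = 1 - (n+1)\varepsilon + n\varepsilon^2 > 1 - (n+1)\varepsilon \]

and

\[ (1+\varepsilon)^{n+1} = (1+\varepsilon)(1+\varepsilon)^n \le (1+\varepsilon)(1 + (2^n - 1)\varepsilon) = 1 + (2^n - 1)\varepsilon + \varepsilon + (2^n - 1)\varepsilon^2 < 1 + (2^{n+1} - 1)\varepsilon. \]

(ii) This follows from the equation

\[ a^n - b^n = (a-b)(a^{n-1} + a^{n-2}b + \cdots + ab^{n-2} + b^{n-1}). \]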
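
Part (i) of the lemma is easy to test in exact rational arithmetic. The following sketch is only a sanity check on a finite grid of values, assuming ε is a `Fraction` with 0 < ε < 1; the name `lemma_holds` is illustrative.

```python
from fractions import Fraction

def lemma_holds(eps, n):
    """Check 1 - n*eps <= (1 - eps)**n < (1 + eps)**n <= 1 + (2**n - 1)*eps."""
    return (1 - n * eps
            <= (1 - eps) ** n
            < (1 + eps) ** n
            <= 1 + (2 ** n - 1) * eps)

# eps = 1/10, 2/10, ..., 9/10 and n = 1, ..., 7, all in exact arithmetic.
checks = [lemma_holds(Fraction(k, 10), n)
          for k in range(1, 10) for n in range(1, 8)]
print(all(checks))  # True
```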


Proof of Theorem 2.10.11 Let B = {x ∈ R : x^n < y}. Since 0 ∈ B, B is
non-empty. If y ≤ 1 then B is bounded above by 1. If y > 1 then y^n > y, so
that if x ∈ B then x^n < y^n and x < y, by the lemma. Thus B is bounded
above by y. Therefore B has a supremum s, say. We shall show that s^n = y.
There are three possibilities; either s^n < y or s^n > y or s^n = y. We shall
show that the first two of these cannot occur, so that s^n = y.

Suppose first that s^n < y. Choose 0 < η < (y − s^n)/((2^n − 1)y). Note that
0 < η < 1. By the lemma,

\[ ((1+\eta)s)^n \le s^n + (2^n - 1)\eta s^n < s^n + (y - s^n) = y, \]

so that (1 + η)s ∈ B. Since (1 + η)s > s, this contradicts the fact that s is
an upper bound for B.

Secondly, suppose that s^n > y. Choose 0 < θ < (s^n − y)/(ns^n). Note that
0 < θ < 1. If x ∈ B then

\[ ((1-\theta)s)^n - x^n > (1 - n\theta)s^n - y = (s^n - y) - n\theta s^n > 0. \]

By the lemma, (1 − θ)s > x, so that (1 − θ)s is an upper bound for B,
contradicting the fact that s is the least upper bound of B.

Consequently, s^n = y. Finally, s is unique. For if t^n = y, then s^n − t^n = 0,
and it follows from part (ii) of the lemma that s = t.


The number s is denoted by y^{1/n}: it is the nth root of y.

This proof is all very well, but it is very cumbersome. Surely there is a 
better proof! There certainly is, but it requires us to do a good deal of analysis 
first. In due course, we shall see that this result is an easy consequence of the 
intermediate value theorem (Theorem 6.4.1). 
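
In the meantime, the supremum s of the set B used in the proof of Theorem 2.10.11 can be approximated by bisection. This is a numerical illustration, not part of the proof: the function name `nth_root_bounds` and the parameter `steps` are assumptions made for the sketch, and y is taken to be a positive rational.

```python
from fractions import Fraction

def nth_root_bounds(y, n, steps=40):
    """Return rationals lo, hi with lo**n < y <= hi**n and hi - lo halved at each step."""
    y = Fraction(y)
    if y <= 0 or n < 1:
        raise ValueError("expected y > 0 and n >= 1")
    lo, hi = Fraction(0), max(Fraction(1), y)  # 0 lies in B; max(1, y) bounds B above
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid ** n < y:
            lo = mid  # mid lies in B
        else:
            hi = mid  # mid is an upper bound for the non-negative elements of B
    return lo, hi

lo, hi = nth_root_bounds(2, 2)
print(float(lo), float(hi))
```

At every stage lo belongs to B and hi is an upper bound for B, so that the supremum of B lies between the two returned values.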


Part Two 


Functions of a real variable 


3 


Convergent sequences 


3.1 The real numbers 


At the beginning of the nineteenth century, it became clear that mathemat- 
ical analysis (the study of functions and of series) lacked a satisfactory firm 
foundation. In 1821, Augustin-Louis Cauchy published his Cours d’Analyse, 
which contained the first rigorous account of mathematical analysis. Cauchy 
however took the properties of the real numbers for granted. In 1858, when 
Richard Dedekind was preparing a course of lectures on the elements of the 
differential calculus at the Polytechnic School in Zürich, he ‘felt more keenly
than ever the lack of a really scientific foundation for arithmetic’, and discovered
the construction of the real number system that is described in the
Prologue. In fact, he only published his results in 1872.¹ With hindsight, it
has become clear that the properties of the real number system lie at the 
heart of all mathematical analysis, and that it is essential to obtain a full 
understanding of these properties in order to develop mathematical analysis. 

In the Prologue, we have constructed Dedekind’s model for the real num- 
bers R and established some of its properties. It is however sensible to take 
the construction for granted, to write down the essential properties of R, 
and to use these properties to develop the theory of mathematical analysis. 
This we shall do. 

What are the essential properties of R? First, R is a field: that is, addition, 
multiplication, subtraction and division have been defined to satisfy the 
usual conditions of arithmetic. 

Secondly, there is a total order on R: if x,y ∈ R then either x ≤ y or
y ≤ x, and both occur if and only if x = y; further if x ≤ y and y ≤ z then
x ≤ z. The order makes R an ordered field; if x ≤ y then x + z ≤ y + z,
and if x ≤ y and z ≥ 0 then xz ≤ yz. R contains a copy of the set of
rational numbers Q, and, within it, copies of the integers Z and the natural
numbers N.


1 See Richard Dedekind, Essays on the Theory of Numbers, Dover, 1963.

The set Q of rational numbers is also an ordered field, but it is not
adequate for analysis. The fundamental property of R that we shall use
over and over again relates to the order structure of R. If A is a non-empty
subset of R and b ∈ R, then b is an upper bound for A if b ≥ a for all a ∈ A.
Then R has the supremum property: if A is a non-empty subset of R with an
upper bound, then A has a least upper bound, or supremum sup(A): there
exists an upper bound c ∈ R such that if b is any other upper bound of A
then c ≤ b. It is important that the supremum of a non-empty set A may or
may not belong to A. For example, if R⁻ = {x ∈ R : x < 0} is the set of
negative numbers, then sup(R⁻) = 0, and 0 ∉ R⁻.

The supremum property is equivalent (Proposition 2.9.4) to the requirement
that every non-empty subset A of R which is bounded below has a
greatest lower bound, or infimum. For the set B of lower bounds of A is non-empty
and bounded above by any element of A, and so B has a supremum,
s, say. We show that s is the infimum of A. Suppose, if possible, that a ∈ A
and that a < s. Then a is not an upper bound for B, and so there
exists b ∈ B with a < b ≤ s. But then b is not a lower bound for A, giving
a contradiction. Thus a ≥ s for all a ∈ A, and so s ∈ B. If c > s then c ∉ B,
so that c is not a lower bound for A, and so s is the infimum of A.

Here is a first application of the supremum property. 


Theorem 3.1.1 (i) Let J = {1/n : n ∈ N}. Then 0 is the infimum
of J.

(ii) If x ∈ R and x > 0 then there exists n ∈ N with n > x.

(iii) If x < y then there exists r ∈ Q such that x < r < y.


Proof (i) 0 is a lower bound for J, and so l = inf J exists, and l ≥ 0.
Suppose, if possible, that l > 0. Then 2l > l, and so 2l is not a lower bound
for J. Thus there exists n ∈ N such that 1/n < 2l. But then 1/2n < l,
giving a contradiction.

(ii) 1/x > 0, so that 1/x is not a lower bound for J. There exists n ∈ N
such that 1/n < 1/x. Then n > x.

(iii) First, we prove this in the case where 0 ≤ x < y. By (i), there exists
n ∈ N such that 1/n < y − x. Let A = {k ∈ Z⁺ : k ≤ nx}. A is non-empty
(0 ∈ A) and finite, by (ii), and so it has a greatest element a. Then
a ≤ nx < a + 1, so that if we set r = (a + 1)/n, then x < r. On the other
hand r = a/n + 1/n < x + (y − x) = y.

If x < 0 < y, we can take r = 0; if x < y ≤ 0, the result follows by
considering −y and −x.


Statement (i) says that there are no infinitesimally small members of R⁺,
and statement (ii), which is known as the Archimedean property, says that
there are no infinitely large members.

Statement (iii) is an existence statement; when x > 0, it is desirable to
give an explicit procedure for determining a rational r with x < r < y. There
exists a least positive integer, q₀, say, such that 1/q₀ < y − x, and there then
exists a least integer, p₀, say, such that x < p₀/q₀. Then x < p₀/q₀ < y
and r₀ = p₀/q₀ is uniquely determined. Let us call r₀ the ‘best’ rational
satisfying x < r < y.
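
The procedure for finding the ‘best’ rational can be written out directly. The sketch below assumes, purely for the purpose of exact computation, that x and y are themselves given as `Fraction` values; the function name `best_rational` is hypothetical.

```python
from fractions import Fraction
from math import floor

def best_rational(x, y):
    """Return p0/q0: q0 is the least positive integer with 1/q0 < y - x,
    and p0 is the least integer with x < p0/q0."""
    if not x < y:
        raise ValueError("expected x < y")
    q0 = floor(1 / (y - x)) + 1  # least positive integer q with q > 1/(y - x)
    p0 = floor(x * q0) + 1       # least integer p with p > x*q0
    return Fraction(p0, q0)

print(best_rational(Fraction(1, 3), Fraction(1, 2)))  # 3/7
```

Here y − x = 1/6, so that q₀ = 7, and p₀ = 3 is the least integer with 1/3 < p₀/7. Note that `Fraction` reduces its result to lowest terms, so the denominator that is printed may be smaller than q₀ in other examples.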

Suppose that x ∈ R. We set

\[ x^+ = x,\ x^- = 0 \ \text{ if } x \ge 0, \quad\text{and}\quad x^+ = 0,\ x^- = -x \ \text{ if } x < 0. \]

Then x = x⁺ − x⁻ and |x| = x⁺ + x⁻ ≥ 0. |x| is the modulus, or absolute
value, of x; it measures the size of x. Note that if one of x,y is positive and
the other negative then |x + y| < |x| + |y|; otherwise |x + y| = |x| + |y|;
note also that |x|.|y| = |xy|. We set d(x,y) = |x − y|; d(x,y) is the distance
between x and y.


Proposition 3.1.2 If x,y,z ∈ R then

(i) d(x,y) = d(y,x);
(ii) d(x,y) = 0 if and only if x = y;
(iii) d(x,z) ≤ d(x,y) + d(y,z) (the triangle inequality).


Proof (i) follows from the fact that |x| = |−x|, and (ii) from the fact that
|x| = 0 if and only if x = 0. Finally,

\[ |x - z| = |(x - y) + (y - z)| \le |x - y| + |y - z|. \]


The function d : R × R → R is a metric on R. We shall study more
general metrics in Volume II.

The problem of the existence of the square root of 2 arose as a problem in
geometry. It is natural to think of the set R of real numbers geometrically,
and to think of them as points on a line, the real line, arranged in order.


Figure 3.1. The real line.


If a ∈ R, we can then consider the mapping x → x + a as a shift, or
translation, shifting everything an amount a to the right, if a > 0, and an
amount |a| to the left, if a < 0. The mapping x → −x is a reflection about
0. If a > 0 then the mapping x → ax is a dilation or scaling, scaling x by a
factor of a.

The totally ordered set R does not have a greatest or least element. It is 
sometimes convenient to add two points +oo and —oo, to obtain the extended 
real line R. Thus R = {—co} URU {+00}. The order is extended by setting 
—oo < x < +00 for every real number x. Then R is order complete — every 
non-empty subset has an infimum and a supremum. If A is a non-empty 
subset of R then inf A = —oo if and only if A does not have a lower bound 
in R, and sup A = +o if and only if A does not have an upper bound in R. 

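Some care must be taken in extending the algebraic operations from $\mathbb{R}$
to $\overline{\mathbb{R}}$. Common sense and prudence suggest the following rules.

If $x \in \mathbb{R}$, then $(+\infty) + x = x + (+\infty) = x - (-\infty) = +\infty$,
$(-\infty) + x = x + (-\infty) = x - (+\infty) = -\infty$, and
$x/(+\infty) = x/(-\infty) = 0$.

$(+\infty) + (+\infty) = +\infty$ and $(-\infty) + (-\infty) = -\infty$.
The sums $(+\infty) + (-\infty)$ and $(-\infty) + (+\infty)$ are not defined.

If $x > 0$ then $(-\infty).x = x.(-\infty) = -\infty$, and $x/0 = +\infty$.

If $x < 0$ then $(-\infty).x = x.(-\infty) = +\infty$, and $x/0 = -\infty$.

The products
$0.(+\infty)$, $0.(-\infty)$, $(+\infty).0$ and $(-\infty).0$
and the quotients
$(+\infty)/(+\infty)$, $(+\infty)/(-\infty)$, $(-\infty)/(+\infty)$, $(-\infty)/(-\infty)$,
$0/0$, $(+\infty)/0$ and $(-\infty)/0$
are not defined.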
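As an informal comparison, floating-point arithmetic adopts several of the same conventions. The following is a minimal sketch, assuming Python's built-in `float` type (IEEE 754 doubles); the variable names and the helper `division_by_zero_is_rejected` are illustrative only. Combinations that are left undefined above come out as NaN, while Python (unlike the convention above for $x/0$) refuses division by zero altogether.

```python
import math

INF = math.inf  # plays the role of +infinity; -INF plays the role of -infinity

x = 3.5  # an arbitrary real number, chosen only for illustration

# Rules that floats share with the extended real line.
sum_with_plus_inf = INF + x        # (+inf) + x = +inf
diff_with_minus_inf = x - (-INF)   # x - (-inf) = +inf
sum_with_minus_inf = (-INF) + x    # (-inf) + x = -inf
quotient_by_inf = x / INF          # x / (+inf) = 0

# Combinations left undefined in the text evaluate to NaN ("not a number").
undefined_sum = INF + (-INF)
undefined_product = 0.0 * INF
undefined_quotient = INF / INF


def division_by_zero_is_rejected(numerator: float) -> bool:
    """Return True if Python refuses to evaluate numerator / 0.0."""
    try:
        numerator / 0.0
    except ZeroDivisionError:
        return True
    return False
```

The NaN results can be detected with `math.isnan`; they are the floating-point analogue of saying that an expression is not defined.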


Exercises


3.1.1 Show that the sum of a rational number and an irrational number is
irrational. What about the product?

3.1.2 Suppose that $r, r'$ are two rational numbers, with $r < r'$. Show that
there exists an irrational number $x$ with $r < x < r'$.

3.1.3 Suppose that $A$ and $B$ are non-empty subsets of $\mathbb{R}$ which are bounded
below. Let $A + B = \{x \in \mathbb{R} : x = a + b \text{ for some } a \in A, b \in B\}$. Show
that $A + B$ is bounded below and that $\inf(A + B) = \inf A + \inf B$.
What about products?

3.1.4 Suppose that $(a_n)_{n=1}^\infty$ and $(b_n)_{n=1}^\infty$ are sequences in $\mathbb{R}$ such that the
sets $\{a_n : n \in \mathbb{N}\}$ and $\{b_n : n \in \mathbb{N}\}$ are bounded above. Show that the
set $\{a_n + b_n : n \in \mathbb{N}\}$ is bounded above. Is

$$\sup\{a_n + b_n : n \in \mathbb{N}\} = \sup\{a_n : n \in \mathbb{N}\} + \sup\{b_n : n \in \mathbb{N}\}?$$

3.1.5 Let $\mathbb{Q}(\sqrt{2})$ be the set of all real numbers of the form $r + s\sqrt{2}$, with
$r, s \in \mathbb{Q}$. Show that $\mathbb{Q}(\sqrt{2})$ is a subfield of $\mathbb{R}$. Show that there are two
total orderings of $\mathbb{Q}(\sqrt{2})$ under which it is an ordered field.

3.1.6 Suppose that $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ are real numbers. Establish
Lagrange's identity:

$$\left(\sum_{i=1}^n a_i b_i\right)^2 + \sum_{\{(i,j) : i < j\}} (a_i b_j - a_j b_i)^2 = \left(\sum_{i=1}^n a_i^2\right)\left(\sum_{i=1}^n b_i^2\right).$$

Deduce Cauchy's inequality:

$$\left|\sum_{i=1}^n a_i b_i\right| \le \left(\sum_{i=1}^n a_i^2\right)^{1/2} \left(\sum_{i=1}^n b_i^2\right)^{1/2},$$

with equality holding if and only if $a_i b_j = a_j b_i$ for $1 \le i, j \le n$.

3.1.7 If $\alpha \in \mathbb{R}$ and $k \in \mathbb{N}$, define the binomial coefficient to be

$$\binom{\alpha}{k} = \frac{\alpha(\alpha - 1) \cdots (\alpha - k + 1)}{k!}.$$

Prove de Moivre's formula

$$\binom{\alpha + 1}{k} = \binom{\alpha}{k} + \binom{\alpha}{k - 1}$$

and its generalization, Vandermonde's formula,

$$\binom{\alpha + \beta}{k} = \sum_{j=0}^{k} \binom{\alpha}{j} \binom{\beta}{k - j}.$$

(Hint: First suppose that $\beta \in \mathbb{N}$. Each side of the equation is a polynomial
in $\alpha$, and the two polynomials take the same values when $\alpha \in \mathbb{N}$.
Now repeat the argument for $\beta$.)
Let us now look again at statement (i) of Theorem 3.1.1.


Theorem 3.2.1 If $\epsilon > 0$ then there exists $n_0$ such that $0 < 1/n < \epsilon$ for
$n \ge n_0$.


Proof Since $\epsilon > 0$ and $0$ is the infimum of $J = \{1/n : n \in \mathbb{N}\}$, there exists
$n_0$ such that $1/n_0 < \epsilon$. If $n \ge n_0$, then $0 < 1/n \le 1/n_0 < \epsilon$.


This suggests the following definition for more general sequences than
$(1/n)_{n=1}^\infty$. (We consider sequences indexed either by $\mathbb{N}$ or by $\mathbb{Z}^+$, the set of
non-negative integers. Since we are concerned with what happens for large
values of $n$, there is no real difference between the two cases, and we shall
only state and prove results in one or other case.) A real-valued sequence
$(a_n)_{n=0}^\infty$ converges to $0$ as $n$ tends to $\infty$, or tends to $0$ as $n$ tends to $\infty$, or
is a null sequence, if whenever $\epsilon > 0$ there exists $n_0$ (which usually depends
on $\epsilon$) such that $|a_n| < \epsilon$ for $n \ge n_0$.

A couple of remarks are in order. First, the condition concerns the size $|a_n|$
of $a_n$, rather than $a_n$ itself. We can write the condition as $-\epsilon < a_n < \epsilon$ for
$n \ge n_0$; thus $a_n$ has to satisfy two inequalities, and sometimes it is necessary
to consider the two inequalities separately. Secondly, the sequence $(|a_n|)_{n=0}^\infty$
need not be decreasing. As an example, if we set $a_n = (-1)^n/n + 1/n^2$ then
$(a_n)_{n \in \mathbb{N}}$ is a null sequence, although $a_n$ is, after the first term, alternately
positive and negative, and although $|a_n| - |a_{n+1}|$ is alternately negative and
positive.
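The example $a_n = (-1)^n/n + 1/n^2$ can be checked numerically. The sketch below is illustrative only: the function names `a` and `threshold` are not from the text, and the bound $|a_n| \le 1/n + 1/n^2 \le 2/n$ is an assumption made here to obtain a simple (not optimal) choice of $n_0$.

```python
from fractions import Fraction


def a(n: int) -> Fraction:
    """The n-th term a_n = (-1)^n / n + 1 / n^2, computed exactly (n >= 1)."""
    return Fraction((-1) ** n, n) + Fraction(1, n * n)


def threshold(eps: Fraction) -> int:
    """Return the least n0 with 2 / n0 < eps.

    Since |a_n| <= 1/n + 1/n^2 <= 2/n, this n0 guarantees |a_n| < eps
    for every n >= n0. It is a valid choice, not the smallest possible one.
    """
    n0 = 1
    while Fraction(2, n0) >= eps:
        n0 += 1
    return n0


eps = Fraction(1, 10)
n0 = threshold(eps)
```

With `eps = 1/10` the helper settles on `n0 = 21`. Comparing `abs(a(3))` with `abs(a(4))` shows that the sizes $|a_n|$ are not decreasing, even though the sequence is null.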

We can immediately generalize this definition. Suppose that $l \in \mathbb{R}$. We
say that a real-valued sequence $(a_n)_{n=0}^\infty$ converges to $l$, or tends to $l$, as $n$
tends to $\infty$ if whenever $\epsilon > 0$ there exists $n_0$ (which usually depends on $\epsilon$)
such that $|a_n - l| < \epsilon$ for $n \ge n_0$. In other words, the sequence $(a_n - l)_{n=0}^\infty$ is
a null sequence. Once again, we can split the definition into two: we require
that $l - \epsilon < a_n < l + \epsilon$, for $n \ge n_0$.

If $(a_n)_{n=0}^\infty$ converges to $l$, we say that $l$ is the limit of the sequence, and we
write '$a_n \to l$ as $n \to \infty$', and write $l = \lim_{n \to \infty} a_n$. A warning: we can only
write $l = \lim_{n \to \infty} a_n$ if we know that the sequence has a limit; and many
sequences do not have a limit.


Figure 3.2. A convergent sequence.


When they exist, limits are unique. 


Proposition 3.2.2 If $a_n \to l$ as $n \to \infty$ and $a_n \to m$ as $n \to \infty$, then
$l = m$.


Proof Suppose not. Let $\epsilon = |l - m|/3$, so that $\epsilon > 0$. There exists $n_0$ such
that $|a_n - l| < \epsilon$ for $n \ge n_0$, and there exists $m_0$ such that $|a_n - m| < \epsilon$ for
$n \ge m_0$. Let $p_0 = \max(n_0, m_0)$. Then if $n \ge p_0$,

$$|l - m| \le |a_n - l| + |a_n - m| < 2\epsilon = 2|l - m|/3,$$

giving a contradiction.


A subset $B$ of $\mathbb{R}$ is bounded if it is bounded above and bounded below. A
sequence $(a_n)_{n=0}^\infty$ is bounded if the set of values $\{a_n : n \in \mathbb{Z}^+\}$ is bounded.


Proposition 3.2.3 If $a_n \to l$ as $n \to \infty$ then $(a_n)_{n=1}^\infty$ is bounded.


Proof There exists $n_0$ such that $|a_n - l| < 1$ for $n \ge n_0$. Let $M =
\max\{|a_1|, |a_2|, \ldots, |a_{n_0}|, |l| + 1\}$. If $n \ge n_0$ then $|a_n| \le |a_n - l| + |l| < |l| + 1$,
so that $|a_n| \le M$ for all $n$.


Unbounded sequences can behave in many different ways: we pick out
two where the behaviour is quite respectable. We say that $a_n \to +\infty$ as
$n \to \infty$ if whenever $M \in \mathbb{R}^+$ there exists $n_0$ (which usually depends on $M$)
such that $a_n > M$ for $n \ge n_0$, and that $a_n \to -\infty$ as $n \to \infty$ if whenever
$M \in \mathbb{R}^+$ there exists $n_0$ (which usually depends on $M$) such that $a_n < -M$
for $n \ge n_0$.
We now come to the most important result of this section. 


Theorem 3.2.4 Suppose that $(a_n)_{n=0}^\infty$ is an increasing sequence of real
numbers. If $(a_n)_{n=0}^\infty$ is bounded, then $a_n \to \sup\{a_n : n \in \mathbb{Z}^+\}$ as $n \to \infty$;
otherwise $a_n \to +\infty$ as $n \to \infty$.

Suppose that $(a_n)_{n=0}^\infty$ is a decreasing sequence of real numbers. If $(a_n)_{n=0}^\infty$
is bounded, then $a_n \to \inf\{a_n : n \in \mathbb{Z}^+\}$ as $n \to \infty$; otherwise $a_n \to -\infty$ as
$n \to \infty$.


Proof Suppose that $(a_n)_{n=0}^\infty$ is increasing and bounded. Let $l = \sup\{a_n :
n \in \mathbb{Z}^+\}$. If $\epsilon > 0$ then $l - \epsilon$ is not an upper bound, and so there exists
$n_0$ such that $a_{n_0} > l - \epsilon$. Since $l$ is an upper bound, and since $(a_n)_{n=0}^\infty$ is
increasing, $l - \epsilon < a_{n_0} \le a_n \le l$ for all $n \ge n_0$, so that $|a_n - l| < \epsilon$ for $n \ge n_0$.
Similarly if $(a_n)_{n=0}^\infty$ is increasing and unbounded and $M \in \mathbb{R}^+$, then
there exists $n_0$ such that $a_{n_0} > M$; then $a_n \ge a_{n_0} > M$ for $n \ge n_0$.
Exactly similar arguments work for decreasing sequences.


Why is this so important, when the proof is so easy? First, it provides 
us with a rich supply of convergent sequences. Secondly, it is used in an 
essential way to prove further deep results. In fact, the results can be taken 
to provide another characterization of $\mathbb{R}$, as Exercise 3.3.1 shows.

The notion of convergence fits in well with the algebraic and order 
structure of R, as the following collection of results shows. 


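Theorem 3.2.5 Suppose that $(a_n)_{n=0}^\infty$ and $(b_n)_{n=0}^\infty$ are sequences of real
numbers.

(i) If $a_n = l$ for all $n$, then $a_n \to l$ as $n \to \infty$.
(ii) If $(a_n)_{n=0}^\infty$ is a null sequence, and $(b_n)_{n=0}^\infty$ is bounded, then $(a_n b_n)_{n=0}^\infty$
is a null sequence.
(iii) If $a_n \to a$ and $b_n \to b$ as $n \to \infty$ then $a_n + b_n \to a + b$ as $n \to \infty$.
(iv) If $a_n \to a$ as $n \to \infty$ and $c \in \mathbb{R}$ then $c a_n \to ca$ as $n \to \infty$.
(v) If $a_n \to a$ and $b_n \to b$ as $n \to \infty$ then $a_n b_n \to ab$ as $n \to \infty$.
(vi) If $a_n \ne 0$ and $a \ne 0$ and $a_n \to a$ as $n \to \infty$ then $1/a_n \to 1/a$ as
$n \to \infty$.
(vii) If $a_n \to a$ and $b_n \to b$ as $n \to \infty$ and $a_n \le b_n$ for all $n$ then $a \le b$.
(viii) If $a_n \to a$ as $n \to \infty$ and if $(a_{n_k})_{k=0}^\infty$ is a subsequence, then $a_{n_k} \to a$
as $k \to \infty$.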
Proof The proofs are straightforward. We give details, but will subsequently
leave similar proofs to the reader.

(i) is quite trivial: for any $\epsilon > 0$, take $n_0 = 0$.

(ii) There exists $M > 0$ such that $|b_n| \le M$ for all $n$. Given $\epsilon > 0$, there
exists $n_0$ such that $|a_n| < \epsilon/M$ for $n \ge n_0$. Then $|a_n b_n| < \epsilon$ for $n \ge n_0$.

(iii) Given $\epsilon > 0$, there exists $n_0$ such that $|a_n - a| < \epsilon/2$ for $n \ge n_0$,
and there exists $m_0$ such that $|b_n - b| < \epsilon/2$ for $n \ge m_0$. Let $p_0 =
\max(n_0, m_0)$. Then if $n \ge p_0$,

$$|(a_n + b_n) - (a + b)| \le |a_n - a| + |b_n - b| < \epsilon.$$

(iv) Given $\epsilon > 0$, there exists $n_0$ such that $|a_n - a| < \epsilon/(|c| + 1)$ for $n \ge n_0$.
Then $|c a_n - ca| < \epsilon$ for $n \ge n_0$.

(v) $a_n b_n - ab = (a_n - a)(b_n - b) + (a_n - a)b + a(b_n - b)$. The sequence
$(b_n - b)_{n=0}^\infty$ is a bounded sequence, by Proposition 3.2.3; since $(a_n - a)_{n=0}^\infty$
is a null sequence, the sequence $((a_n - a)(b_n - b))_{n=0}^\infty$ is a null sequence,
by (ii). The sequences $((a_n - a)b)_{n=0}^\infty$ and $(a(b_n - b))_{n=0}^\infty$ are also null
sequences, by (iv), and so the result follows, using (iii).

(vi) There exists $n_0$ such that $|a_n - a| < |a|/2$ for $n \ge n_0$, so that if $n \ge n_0$
then $|a_n| \ge |a| - |a_n - a| > |a|/2$. Thus if $n \in \mathbb{Z}^+$ then

$$|1/a_n a| \le \max(1/|a_0 a|, \ldots, 1/|a_{n_0} a|, 2/|a|^2),$$

so that the sequence $(1/a_n a)_{n=0}^\infty$ is bounded. Since $1/a_n - 1/a = (a -
a_n)/a_n a$, the result follows from (ii).

(vii) We argue by contradiction. Suppose that $a > b$. Using (iii) and (iv),
$a_n - b_n \to a - b$ as $n \to \infty$, and so there exists $n_0$ such that $|(a_n -
b_n) - (a - b)| < a - b$ for $n \ge n_0$. But this implies that $a_n - b_n > 0$,
giving a contradiction. Thus $a \le b$.

(viii) Given $\epsilon > 0$ there exists $N$ such that $|a_n - a| < \epsilon$ for $n \ge N$, and
there exists $k_0$ such that $n_k \ge N$ for $k \ge k_0$. Thus if $k \ge k_0$ then
$|a_{n_k} - a| < \epsilon$.


As an example, if $0 < r < 1$, then the sequence $(r^n)_{n=0}^\infty$ is a decreasing
sequence, bounded below by $0$, and so it converges to a limit $l$, say, by
Theorem 3.2.4. Then $r^{n+1} \to rl$ as $n \to \infty$, by (iv), and $r^{n+1} \to l$ as
$n \to \infty$, by (viii). Thus $rl = l$, by Proposition 3.2.2, and so $l = 0$: $(r^n)_{n=0}^\infty$ is
a null sequence. So also is $(r^n)_{n=0}^\infty$, for $-1 < r < 0$, by (ii).
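The argument above shows that $(r^n)$ is null without exhibiting an explicit $n_0$. As a small illustration, the sketch below searches for the first index at which $r^n$ drops below a given $\epsilon$; the function name `first_index_below` is hypothetical, and exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction


def first_index_below(r: Fraction, eps: Fraction) -> int:
    """Return the least n with r**n < eps, assuming 0 < r < 1 and eps > 0.

    Because the powers of r decrease, r**m < eps for every m >= n as well,
    so the returned index serves as n0 in the definition of a null sequence.
    """
    n = 0
    power = Fraction(1)  # r**0
    while power >= eps:
        power *= r
        n += 1
    return n


n0_half = first_index_below(Fraction(1, 2), Fraction(1, 1000))
n0_slow = first_index_below(Fraction(9, 10), Fraction(1, 100))
```

For $r = 1/2$ and $\epsilon = 1/1000$ the search stops at $n = 10$, since $2^{-10} = 1/1024$. The loop terminates only because the sequence is null, which is exactly what the argument above establishes.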

Some care is needed using (vii). Suppose that $a_n \to a$ and $b_n \to b$ as
$n \to \infty$ and that $a_n < b_n$ for all $n$. Then it does not follow that $a < b$. As an
example, consider the sequences $(r^n)_{n=0}^\infty$ and $(-r^n)_{n=0}^\infty$, where $0 < r < 1$.
As another example, let us give another proof of Theorem 6.4.6: if $y$ is
a positive real number, if $k \in \mathbb{N}$ and if $k \ge 2$, then there exists a unique
positive real number $s$ such that $s^k = y$. We write $s = y^{1/k}$.

Let $a_0 = \max(1, y)$, so that $a_0^k \ge y$. We show that we can define the
sequence $(a_n)_{n=0}^\infty$ recursively by setting

$$a_{n+1} = \frac{1}{k}\left((k - 1)a_n + \frac{y}{a_n^{k-1}}\right) = a_n\left(1 - \frac{a_n^k - y}{k a_n^k}\right) \quad \text{for } n \in \mathbb{Z}^+,$$

and that $0 < a_{n+1} \le a_n$ and $y \le a_n^k$. Suppose that we have defined $a_n$, and
shown (if $n > 0$) that $0 < a_n \le a_{n-1}$ and $y \le a_n^k$. Since $a_n > 0$, $a_{n+1}$ is
properly defined. Since $k a_n^k > a_n^k - y \ge 0$, $0 < a_{n+1} \le a_n$. In order to show
that $a_{n+1}^k \ge y$, we use the following inequality, proved in Lemma 2.10.12:

if $0 < t < 1$ then $(1 - t)^k > 1 - kt$.

Thus

$$a_{n+1}^k = a_n^k\left(1 - \frac{a_n^k - y}{k a_n^k}\right)^k \ge a_n^k\left(1 - \frac{a_n^k - y}{a_n^k}\right) = y.$$

Since the sequence $(a_n)_{n=0}^\infty$ is decreasing and bounded below, it follows that it
converges to a limit $l$ as $n \to \infty$. Using Theorem 3.2.5, it follows that $l^k \ge y$, so that
$l > 0$, and it then follows that

$$a_{n+1} = a_n\left(1 - \frac{a_n^k - y}{k a_n^k}\right) \to l\left(1 - \frac{l^k - y}{k l^k}\right) \quad \text{as } n \to \infty.$$

Since $a_{n+1} \to l$ as $n \to \infty$, it therefore follows from Proposition 3.2.2 that

$$l = l\left(1 - \frac{l^k - y}{k l^k}\right),$$

so that $l^k = y$. If $m$ is positive and $l^k = m^k$ then

$$0 = l^k - m^k = (l - m)(l^{k-1} + l^{k-2}m + \cdots + m^{k-1}),$$

so that $l = m$.

This may seem to be a proof that is too complicated to be interesting.
But it is not. It is an example of the use of the Newton–Raphson method:
the sequence $(a_n)_{n=0}^\infty$ converges very rapidly. Thus it not only proves the
existence of $y^{1/k}$, but also enables good approximations to it to be calculated.
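The recursion can be tried out directly. The following is a minimal sketch, assuming exact rational arithmetic via Python's `fractions` module; the function name `kth_root_iterates` and the choice $y = 2$, $k = 2$ are illustrative and not part of the text.

```python
from fractions import Fraction


def kth_root_iterates(y: Fraction, k: int, steps: int) -> list:
    """Return [a_0, a_1, ..., a_steps] for the recursion
    a_0 = max(1, y),  a_{n+1} = ((k - 1) * a_n + y / a_n**(k - 1)) / k.
    """
    a = max(Fraction(1), y)
    iterates = [a]
    for _ in range(steps):
        a = ((k - 1) * a + y / a ** (k - 1)) / k
        iterates.append(a)
    return iterates


# Approximating the square root of 2 (y = 2, k = 2).
iterates = kth_root_iterates(Fraction(2), 2, 5)
approximation = float(iterates[-1])
```

The first iterates are $2$, $3/2$, $17/12$ and $577/408$. The list is decreasing and every term satisfies $a_n^k \ge y$, as in the proof, and after five steps the floating-point value agrees with $\sqrt{2}$ to the precision of a double.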
Here is an easy but useful result.


Proposition 3.2.6 (The sandwich principle) Suppose that $a_n \le b_n \le c_n$
for all $n$, and that $a_n \to l$ and $c_n \to l$ as $n \to \infty$. Then $b_n \to l$ as $n \to \infty$.


Proof Given $\epsilon > 0$ there exists $n_0$ such that $|a_n - l| < \epsilon$ and $|c_n - l| < \epsilon$
for $n \ge n_0$. Thus

$$l - \epsilon < a_n \le b_n \le c_n < l + \epsilon \quad \text{for } n \ge n_0,$$

so that $b_n \to l$ as $n \to \infty$.


Corollary 3.2.7 If $x \in \mathbb{R}$, there exist a strictly increasing sequence
$(r_n)_{n=1}^\infty$ of rational numbers and a strictly decreasing sequence $(s_n)_{n=1}^\infty$ of
rational numbers such that $r_n \to x$ and $s_n \to x$ as $n \to \infty$.


Proof Using the notation following Theorem 3.1.1, let $r_1$ be the 'best'
rational with $x - 1 < r_1 < x$ and let $s_1$ be the 'best' rational with $x < s_1 <
x + 1$. Arguing recursively, let $r_n$ be the 'best' rational with

$$\max\left(x - \frac{1}{n}, r_{n-1}\right) < r_n < x,$$

and let $s_n$ be the 'best' rational with

$$x < s_n < \min\left(x + \frac{1}{n}, s_{n-1}\right).$$

Then $(r_n)_{n=1}^\infty$ is a strictly increasing sequence and $(s_n)_{n=1}^\infty$ is a strictly
decreasing sequence. Since

$$x - \frac{1}{n} < r_n < s_n < x + \frac{1}{n},$$

$r_n \to x$ and $s_n \to x$ as $n \to \infty$, by the sandwich principle.


The next result shows that to test for convergence we need only consider
a sequence of values of $\epsilon$.


Proposition 3.2.8 Suppose that $(\epsilon_k)_{k=1}^\infty$ is a null sequence of positive
numbers. Then a sequence $(a_n)_{n=0}^\infty$ converges to $l$ if and only if for each
$k$ there exists $n_k$ such that $|a_n - l| < \epsilon_k$ for $n \ge n_k$.


Proof The condition is certainly necessary. If it is satisfied, and if $\epsilon > 0$,
then there exists $k \in \mathbb{N}$ such that $0 < \epsilon_k < \epsilon$. If $n \ge n_k$, then $|a_n - l|
< \epsilon_k < \epsilon$.


It is often convenient to take $\epsilon_k = 1/k$, or to take $\epsilon_k = 1/2^k$.
Exercises


3.2.1 Suppose that $a > b > 0$. Find $\lim_{n \to \infty}(a^n - b^n)/(a^n + b^n)$.

3.2.2 Show that $\sqrt{n + 1} - \sqrt{n} \to 0$ as $n \to \infty$.

3.2.3 Does $n^{100000}/2^n$ converge, as $n \to \infty$?

3.2.4 Suppose that $x \in \mathbb{R}$ and that $x > 1$, and that $k \in \mathbb{N}$. Does $x^n/n^k$
converge?

3.2.5 Suppose that $-1 < x < 1$ and that $\alpha \in \mathbb{R}$. Show that

$$\binom{\alpha}{n} x^n \to 0 \quad \text{as } n \to \infty.$$


(\soderus Gen) . 


3.2.6 Suppose that $x \in \mathbb{R}$. Show that $x^n/n! \to 0$ as $n \to \infty$.

3.2.7 Let $a_n = \sqrt{n^2 + n} - n$, for $n \in \mathbb{N}$. Show that $a_n$ converges as $n \to \infty$.
What is the limit? Is the sequence $(a_n)_{n=1}^\infty$ monotonic?

3.2.8 Let $a_n = \sum_{j=n+1}^{2n} (1/j)$ and let $b_n = \sum_{j=n}^{2n} (1/j)$. Show that $(a_n)_{n=1}^\infty$
is an increasing sequence, that $(b_n)_{n=1}^\infty$ is a decreasing sequence, and
that they tend to a common limit as $n \to \infty$.


3.2.9 Use a calculator, and the method described, to calculate 2!/° to 5 


decimal places. 


3.2.10 Suppose that $a_1, \ldots, a_n$ are positive real numbers. Let $A = (a_1 +
\cdots + a_n)/n$ be the arithmetic mean and let $G = (a_1 a_2 \ldots a_n)^{1/n}$ be
the geometric mean. Suppose that $a_1, \ldots, a_n$ are not all equal. Show
that there exist $1 \le i, j \le n$ such that $a_i > A$ and $a_j < A$. Show that
$a_i a_j < A(a_i + a_j - A)$.

Let $a_i' = A$, $a_j' = a_i + a_j - A$ and let $a_k' = a_k$ for $k \ne i, j$. Let $A'$ and
$G'$ be the corresponding means. Show that $A' = A$ and $G' > G$.
Show by induction on $|\{i : a_i \ne A\}|$ that $A \ge G$, with equality if and
only if $a_1 = a_2 = \cdots = a_n$. (The arithmetic mean-geometric mean
inequality.)

3.2.11 Use the arithmetic mean-geometric mean inequality to establish the
following results.

(a) If $nt > -1$ then $(1 + t)^n \ge 1 + nt$.
(b) If $-x < n < m$ then $(1 + x/n)^n < (1 + x/m)^m$.
(c) If $x > 0$ then $(1 - x/n)^n$ converges to a positive limit, as $n \to \infty$.
(d) If $x > 0$ then $(1 - x/n^2)^n \to 1$ as $n \to \infty$.
(e) If $x > 0$ then $(1 + x/n)^n$ converges to a finite limit, as $n \to \infty$.

3.2.12 Suppose that $-1 < t < 1$. Define $t_n$ recursively by setting $t_0 = 0$ and
$t_n = t_{n-1} + \frac{1}{2}(t^2 - t_{n-1}^2)$. Show that $0 \le t_{n-1} \le t_n \le |t|$, for all $n \in \mathbb{N}$.
What is $\lim_{n \to \infty} t_n$?
3.2.13 Suppose that $0 < a_0 < b_0$. Define $a_n$ and $b_n$ recursively by setting
$a_n = 2a_{n-1}b_{n-1}/(a_{n-1} + b_{n-1})$ and $b_n = (a_{n-1} + b_{n-1})/2$. Show that
$a_{n-1} < a_n < b_n < b_{n-1}$. Determine $\lim_{n \to \infty} a_n$ and $\lim_{n \to \infty} b_n$.

3.2.14 Suppose that $0 < a_0 < b_0$. Define $a_n$ and $b_n$ recursively by setting
$a_n = \sqrt{a_{n-1}b_{n-1}}$ and $b_n = (a_{n-1} + b_{n-1})/2$. Show that $a_{n-1} < a_n <
b_n < b_{n-1}$, and show that the sequences $(a_n)_{n=0}^\infty$ and $(b_n)_{n=0}^\infty$ tend to
a common limit as $n \to \infty$.

3.2.15 Let $R_n = F_{n+1}/F_n$, where $F_n$ is the $n$th Fibonacci number, and
$n \ge 1$. Show that $(R_{2n-1})$ is an increasing sequence and that $(R_{2n})$
is a decreasing sequence. Show that $R_n$ tends to a limit as $n \to \infty$,
and find the limit.

3.2.16 Give an example of a sequence $(x_n)_{n=1}^\infty$ such that $x_{kn}$ converges as
$n \to \infty$ for all $k \ge 2$, whereas $x_n$ does not converge as $n \to \infty$.

3.2.17 Suppose that $(x_n)_{n=1}^\infty$ is a sequence such that each of the subsequences
$(x_{2n})_{n=1}^\infty$, $(x_{2n+1})_{n=1}^\infty$ and $(x_{3n})_{n=1}^\infty$ converges as $n \to \infty$. Show that
the sequence $(x_n)_{n=1}^\infty$ converges as $n \to \infty$.


3.3 The uniqueness of the real number system 


In the Prologue, Dedekind cuts were used to construct the real number 
system R. As Exercise 3.3.2 shows, there are other ways of constructing the 
real numbers, and we need to show that the outcome is essentially the same. 

First, let us introduce some terminology. Suppose that $x$ is a positive real
number. We set $\lfloor x \rfloor = \sup\{n \in \mathbb{Z}^+ : n \le x\}$ and $\{x\} = x - \lfloor x \rfloor$, so that
$x = \lfloor x \rfloor + \{x\}$, and $0 \le \{x\} < 1$. $\lfloor x \rfloor$ is the integral part of $x$, and $\{x\}$ is the
fractional part of $x$. The latter is not only bad notation (the context should
however make it clear when $\{x\}$ is being used for the singleton set) but also
bad terminology, since 'fractional' suggests incorrectly that $\{x\}$ must be a
rational number.


Theorem 3.3.1 Suppose that $\mathbb{R}'$ is an ordered field with the supremum
property. There exists a unique bijection $j : \mathbb{R} \to \mathbb{R}'$ such that if $x, y \in \mathbb{R}$
then

(i) $j(x + y) = j(x) + j(y)$ and $j(xy) = j(x)j(y)$, and

(ii) if $x < y$ then $j(x) < j(y)$.


Proof We use the fact that each of $\mathbb{R}$ and $\mathbb{R}'$ contains a copy of the
rational numbers (Theorem 2.8.3). If $r$ is a rational in $\mathbb{R}$, let $j_{\mathbb{Q}}(r)$ be the
corresponding rational in $\mathbb{R}'$.

If $x$ is a positive element of $\mathbb{R}$, we set $x_n = \lfloor 2^n x \rfloor/2^n$. Then $(x_n)_{n=0}^\infty$ is an
increasing sequence of rationals in $\mathbb{R}$, bounded above by $\lfloor x \rfloor + 1$. Further,
$0 \le x - x_n < 1/2^n$, so that $x_n \to x$ as $n \to \infty$. The sequence $(j_{\mathbb{Q}}(x_n))_{n=0}^\infty$
is an increasing sequence of rationals in $\mathbb{R}'$, bounded above by $j_{\mathbb{Q}}(\lfloor x \rfloor + 1)$,
and so it converges, by Theorem 3.2.4 (which can clearly be applied to $\mathbb{R}'$).
We set $j(x) = \lim_{n \to \infty} j_{\mathbb{Q}}(x_n)$. Note that $j(x) > 0$, and that if $x \in \mathbb{Q}$ then
$j_{\mathbb{Q}}(x_n) \to j_{\mathbb{Q}}(x)$, so that $j(x) = j_{\mathbb{Q}}(x)$. If $x < 0$, we set $j(x) = -j(-x)$.
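The dyadic approximations $x_n = \lfloor 2^n x \rfloor / 2^n$ used in this construction are easy to compute. The sketch below is illustrative only: the function name `dyadic_approximation` is not from the text, and the sample value $x = 1/3$ is a rational chosen so that the arithmetic is exact.

```python
import math
from fractions import Fraction


def dyadic_approximation(x: Fraction, n: int) -> Fraction:
    """Return x_n = floor(2**n * x) / 2**n for a positive rational x."""
    return Fraction(math.floor(2 ** n * x), 2 ** n)


x = Fraction(1, 3)
approximations = [dyadic_approximation(x, n) for n in range(10)]
errors = [x - x_n for x_n in approximations]
```

For $x = 1/3$ the list begins $0, 0, 1/4, 1/4, 5/16$. It is increasing, and each error satisfies $0 \le x - x_n < 1/2^n$, which is the estimate used in the proof.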

If $x$ and $y$ are positive elements of $\mathbb{R}$, and $n \in \mathbb{Z}^+$ then

$$|j_{\mathbb{Q}}((x + y)_n) - j_{\mathbb{Q}}(x_n) - j_{\mathbb{Q}}(y_n)| = |(x + y)_n - x_n - y_n|$$
$$\le |(x + y) - (x + y)_n| + |x - x_n| + |y - y_n| < 3/2^n,$$

so that

$$j(x + y) - j(x) - j(y) = \lim_{n \to \infty} (j_{\mathbb{Q}}((x + y)_n) - j_{\mathbb{Q}}(x_n) - j_{\mathbb{Q}}(y_n)) = 0.$$

Thus $j(x + y) = j(x) + j(y)$. A similar argument shows that $j(xy) = j(x)j(y)$,
and it then follows easily that (i) holds for all $x$ and $y$ in $\mathbb{R}$.

If $x < y$ then there exist rationals $r$ and $s$ such that $x < r < s < y$; then
$j(x) \le j(r) < j(s) \le j(y)$, and so (ii) holds. It follows from this that $j$ is
injective.

Using the same procedure, we construct a mapping $k : \mathbb{R}' \to \mathbb{R}$ for which
the results corresponding to (i) and (ii) hold. If $x \in \mathbb{R}$ and $x > 0$ then

$$k(j(x)) = \lim_{n \to \infty} k(j(x)_n) = \lim_{n \to \infty} x_n = x.$$

If $x < 0$ then $k(j(x)) = -k(-j(x)) = -k(j(-x)) = -(-x) = x$. A similar
argument shows that $j(k(x')) = x'$ for all $x' \in \mathbb{R}'$. Thus $j$ and $k$ are
bijections.

Finally, we show that $j$ is unique. Suppose that $j_1 : \mathbb{R} \to \mathbb{R}'$ is another
mapping satisfying (i) and (ii). Then $j_1(0) = 0$ and $j_1(1) = 1$, from which it
follows that $j_1(r) = j(r)$ for $r$ a rational in $\mathbb{R}$. If $x$ is a positive element of
$\mathbb{R}$, then $j_1(x_n) \to j_1(x)$ as $n \to \infty$. But $j_1(x_n) = j(x_n) \to j(x)$ as $n \to \infty$,
and so $j_1(x) = j(x)$. If $x < 0$ then

$$j_1(x) = -j_1(-x) = -j(-x) = j(x).$$

Thus $j$ is unique.


Exercises 


3.3.1 Define the notion of a convergent sequence in an ordered field. Suppose
that $F$ is an ordered field in which each bounded increasing sequence
converges. In this exercise we show that $F$ has the supremum property,
so that there exists a unique order-preserving field isomorphism of $\mathbb{R}$
onto $F$.

(a) Show that each bounded decreasing sequence converges.

(b) Show that if $\epsilon > 0$ then there exists $n$ such that $1/n < \epsilon$.

(c) Suppose that $A$ is a non-empty subset of $F$ which is bounded
above. Show that there exists $n \in \mathbb{N}$ which is an upper bound for
$A$.

(d) Suppose that $A$ is a non-empty subset of $F^+ = \{x \in F : x > 0\}$
which is bounded above. If $k \in \mathbb{N}$, let

$$b_k = \inf\{j \in \mathbb{N} : j \ge 2^k a \text{ for each } a \in A\},$$

and let $c_k = b_k/2^k$. Show that $(c_k)_{k=1}^\infty$ is a bounded decreasing
sequence.

(e) Let $c = \lim_{k \to \infty} c_k$. Show that $c = \sup A$.

(f) Show that $F$ has the supremum property.

3.3.2 This extended question provides an alternative construction of the real
numbers.

(a) A rational Cauchy sequence is a sequence $r = (r_n)_{n=1}^\infty$ in $\mathbb{Q}$ such
that for each $j \in \mathbb{N}$ there exists $n_j$ such that $|r_m - r_n| < 1/j$ for
$m, n \ge n_j$. If $r$ and $s$ are rational Cauchy sequences, set $r \le s$ if
$r_n \le s_n$ for all $n \in \mathbb{N}$. Show that $\le$ is a partial order on the set $C$
of rational Cauchy sequences.

(b) Define a relation $r \sim s$ on the set $C$ by setting $r \sim s$ if the sequence
$(r_1, s_1, r_2, s_2, \ldots)$ is a rational Cauchy sequence. Show that $\sim$ is an
equivalence relation on $C$.

(c) Let $D = C/\sim$ be the set of equivalence classes in $C$. Define a
relation $\le$ on $D$ by setting $a \le b$ if there exist $r \in a$ and $s \in b$
such that $r \le s$. Show that this defines a total order on $D$. Show
that $D$ does not have a greatest or least element.

(d) If $r \in \mathbb{Q}$, let $r = (r_n)_{n=1}^\infty$, where $r_n = r$ for all $n \in \mathbb{N}$, and let
$j(r) = [r]$ be its equivalence class in $D$. Show that $j$ is an injective
order-preserving mapping of $\mathbb{Q}$ into $D$.

(e) Define addition and multiplication of elements of $D$, and show that
$D$ is an ordered field.

(f) Show that a bounded increasing sequence in $D$ converges. (This
is the hardest part. Choose representatives, and use a diagonal
argument to find the limit.)
3.4 The Bolzano–Weierstrass theorem


Before reading this section, it is advisable to read Section 2.4, possibly 
excluding Ramsey’s theorem (Theorem 2.4.4). You need to understand the 
diagonal procedure (Theorem 2.4.2) and Theorem 2.4.3. 

Sequences can behave in many different ways: consider for example a
sequence $(q_n)_{n=1}^\infty$ which maps $\mathbb{N}$ onto the set of rational numbers between $0$
and $1$. The next theorem is therefore remarkable and is of great theoretical
importance.


Theorem 3.4.1 (The Bolzano–Weierstrass theorem) Suppose that
$(a_n)_{n=0}^\infty$ is a bounded sequence of real numbers. Then there is a subsequence
$(a_{n_k})_{k=1}^\infty$ which converges.


Proof We shall give two proofs here, and a third proof in the next 
section. (It is always worth giving more than one proof of important results; 
each proof can throw a different light on the result, and the ideas from a 
proof can often be used to prove other results.) Each of the proofs uses 
Theorem 3.2.4. 

The first proof is very short. $(a_n)_{n=0}^\infty$ has a monotone subsequence
(Corollary 2.4.3). This subsequence is bounded, and so it converges (Theorem
3.2.4).

The second proof, which is essentially the proof that Weierstrass gave,
uses repeated subdivision, and a diagonal argument. Let us introduce some
notation. If $b, c \in \mathbb{R}$ and $b < c$, then the closed interval $[b, c]$ is the set
$\{x \in \mathbb{R} : b \le x \le c\}$. It has length $c - b$; it is closed because it contains its
endpoints $b$ and $c$. We shall discuss these notions further in Section 4.1.

Since $(a_n)_{n=0}^\infty$ is bounded, there exist $b_0, c_0$ with $b_0 < c_0$ such that $a_n \in
[b_0, c_0]$, for all $n$. Let $d_0 = (b_0 + c_0)/2$ be the midpoint of the closed interval.
Then there are two possibilities. First, there are infinitely many $n$ for which
$a_n \in [b_0, d_0]$; in this case we set $b_1 = b_0$ and $c_1 = d_0$, so that $[b_1, c_1] = [b_0, d_0]$.
Secondly, there are only finitely many $n$ for which $a_n \in [b_0, d_0]$; in this case,
$a_n \in [d_0, c_0]$ for infinitely many $n$, and we set $b_1 = d_0$ and $c_1 = c_0$, so that
$[b_1, c_1] = [d_0, c_0]$. Thus in either case $A_1 = \{n \in \mathbb{N} : a_n \in [b_1, c_1]\}$ is an
infinite subset of $\mathbb{N}$. We have an infinite set of terms in a closed interval of
half the length of the original closed interval.

We now iterate this procedure recursively. At the $j$th step, we obtain a
closed interval $[b_j, c_j]$ such that $[b_j, c_j] \subseteq [b_{j-1}, c_{j-1}]$ and $c_j - b_j = (c_0 - b_0)/2^j$,
and such that $A_j = \{n \in \mathbb{N} : a_n \in [b_j, c_j]\}$ is an infinite subset of $A_{j-1}$.
For each $j \in \mathbb{N}$ let $(n_k^{(j)})_{k=1}^\infty$ be the standard enumeration of $A_j$. Let
$m_j = n_j^{(j)}$, so that $b_j \le a_{m_j} \le c_j$, for $j \in \mathbb{N}$. By the diagonal procedure
(Theorem 2.4.2) the sequence $(m_j)_{j=1}^\infty$ is a subsequence of $\mathbb{N}$. The
sequence $(b_j)_{j=0}^\infty$ is increasing, and is bounded above by $c_0$, and so it
converges as $j \to \infty$, to $b$, say. Since $c_j = b_j + (c_0 - b_0)/2^j$, $c_j$ converges to $b$ as
$j \to \infty$, as well. Since $b_j \le a_{m_j} \le c_j$, $a_{m_j} \to b$ as $j \to \infty$, by the sandwich
principle.


Exercises 


3.4.1 Let $(q_n)_{n=1}^\infty$ be a sequence which maps $\mathbb{N}$ onto the set of rational
numbers between $0$ and $1$. Show that if $l \in [0, 1]$ then there exists a
subsequence $(q_{n_k})_{k=1}^\infty$ which converges to $l$.

3.4.2 Suppose that $(a_n)_{n=0}^\infty$ is a bounded sequence with the property that
there exists $l$ such that if $(a_{n_k})_{k=0}^\infty$ is any convergent subsequence of
$(a_n)_{n=0}^\infty$ then its limit is $l$. Show that $a_n \to l$ as $n \to \infty$.

3.4.3 Suppose that $(a_n)_{n=0}^\infty$ is a bounded sequence of real numbers which
does not converge. Show that $(a_n)_{n=0}^\infty$ has two convergent subsequences
which converge to different limits.


3.5 Upper and lower limits 


Suppose that $(a_n)_{n=0}^\infty$ is a bounded sequence, and that $(a_{n_k})_{k=0}^\infty$ is a
convergent subsequence, convergent to $l$, say. What can we say about $l$?
First, we can say that $l \in [m_0, M_0]$ where $m_0 = \inf\{a_n : n \in \mathbb{Z}^+\}$ and
$M_0 = \sup\{a_n : n \in \mathbb{Z}^+\}$. But it may happen, for example, that $a_0$ is much
larger than all the other terms in the sequence. Then $a_0 = M_0$, and $M_0$
does not give us much information about $l$. Indeed, the value of $l$ is not
constrained in any way by any finite set of values of $a_n$.

Let us therefore set $M_j = \sup\{a_n : n \in \mathbb{Z}^+, n \ge j\}$. Then the sequence
$(M_j)_{j=0}^\infty$ is decreasing (we take suprema over smaller and smaller sets), and
is bounded below by $m_0$. It therefore converges to a limit as $j \to \infty$. This
limit is called the upper limit or limes superior of the sequence $(a_n)_{n=0}^\infty$ and
is denoted by $\limsup_{n \to \infty}(a_n)$. In exactly the same way, we define $m_j =
\inf\{a_n : n \in \mathbb{Z}^+, n \ge j\}$; $(m_j)_{j=0}^\infty$ is increasing and bounded above by $M_0$ and
converges to the lower limit or limes inferior $\liminf_{n \to \infty}(a_n)$ of the sequence
$(a_n)_{n=0}^\infty$. Since $m_j \le M_j$ for all $j$, $\liminf_{n \to \infty}(a_n) \le \limsup_{n \to \infty}(a_n)$.
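The two-stage definition can be illustrated on a concrete sequence. The sketch below uses the hypothetical example $a_n = (-1)^n(1 + 1/(n+1))$, which is not from the text, and approximates each tail supremum and infimum by a maximum and minimum over a finite window. That truncation is an assumption which happens to be exact for this particular sequence, because its even-indexed terms decrease and its odd-indexed terms increase, but it would not be exact in general.

```python
from fractions import Fraction


def a(n: int) -> Fraction:
    """Illustrative bounded sequence a_n = (-1)^n * (1 + 1/(n + 1))."""
    return (-1) ** n * (1 + Fraction(1, n + 1))


def tail_sup(j: int, horizon: int) -> Fraction:
    """Maximum of a_n over j <= n < horizon, standing in for M_j."""
    return max(a(n) for n in range(j, horizon))


def tail_inf(j: int, horizon: int) -> Fraction:
    """Minimum of a_n over j <= n < horizon, standing in for m_j."""
    return min(a(n) for n in range(j, horizon))


HORIZON = 50  # window length, chosen arbitrarily for the illustration
upper = [tail_sup(j, HORIZON) for j in range(40)]
lower = [tail_inf(j, HORIZON) for j in range(40)]
```

Here `upper` starts at $M_0 = 2$ and decreases towards $1$, while `lower` starts at $m_0 = -3/2$ and increases towards $-1$. The large first term $a_0 = 2$ affects only $M_0$, in line with the remark that no finite set of terms constrains the limit of a subsequence.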

Upper and lower limits are a little complicated, being defined in two 
stages; first we consider a sequence of suprema or infima, and secondly 
we take the limit of these sequences. Such a procedure will recur
elsewhere! We can characterize upper and lower limits by their fundamental
properties. 


Theorem 3.5.1 Suppose that $(a_n)_{n=0}^\infty$ is a bounded sequence, and that $S =
\limsup_{n \to \infty}(a_n)$. Then $S$ is the unique real number with the two following
properties:

(i) if $t > S$ then there exists $n_0$ such that $a_n < t$ for all $n \ge n_0$;

(ii) if $r < S$ and $n \in \mathbb{N}$ then there exists $m \in \mathbb{N}$ with $m \ge n$ such that
$a_m > r$.

There is a similar characterization of $\liminf_{n \to \infty}(a_n)$.


We can express (i) by saying that $a_n$ is eventually less than $t$, and (ii) by
saying that $a_n$ is frequently greater than $r$.


Proof First, we show that $S$ satisfies (i) and (ii).

(i) Since $S = \inf\{M_j : j \in \mathbb{N}\}$ and $t > S$, $t$ is not a lower bound for
$\{M_j : j \in \mathbb{N}\}$. Thus there exists $n_0$ such that $M_{n_0} < t$: then $a_n \le M_{n_0} < t$
for $n \ge n_0$.

(ii) Since $r < S \le M_n$ and $M_n = \sup\{a_m : m \ge n\}$, $r$ is not an upper
bound for $\{a_m : m \ge n\}$. Thus there exists $m \ge n$ with $a_m > r$.

We now turn to uniqueness. Suppose that $T > S$. Let $U = (S + T)/2$, so
that $T > U > S$. By (i), there exists $n_0$ such that $a_n < U$ for all $n \ge n_0$,
and so (ii) does not hold for $T$.

Suppose that $R < S$. Let $Q = (S + R)/2$, so that $R < Q < S$. By (ii), if
$n \in \mathbb{N}$ there exists $m \ge n$ with $a_m > Q$, and so (i) does not hold for $R$.


We can now answer the question that was raised at the beginning of the
section, and give a third proof of the Bolzano–Weierstrass theorem.


Theorem 3.5.2 Suppose that $(a_{n_j})_{j=0}^\infty$ is a convergent subsequence of a
bounded sequence $(a_n)_{n=0}^\infty$. Then

$$\liminf_{n \to \infty}(a_n) \le \lim_{j \to \infty} a_{n_j} \le \limsup_{n \to \infty}(a_n).$$

Further, there exist subsequences $(a_{l_j})_{j=0}^\infty$ and $(a_{m_j})_{j=0}^\infty$ such that

$$a_{l_j} \to \liminf_{n \to \infty}(a_n) \text{ and } a_{m_j} \to \limsup_{n \to \infty}(a_n) \text{ as } j \to \infty.$$

Proof Since $m_{n_j} \le a_{n_j} \le M_{n_j}$, and since $m_{n_j} \to \liminf_{n \to \infty}(a_n)$ and
$M_{n_j} \to \limsup_{n \to \infty}(a_n)$, the first result follows from Theorem 3.2.5 (vii).
Let $S = \limsup_{n \to \infty}(a_n)$. By Theorem 3.5.1 (i), there exists a least $n_0$
such that $a_n < S + 1$ for all $n \ge n_0$, and by (ii) there exists a least $p_0 \ge n_0$
such that $a_{p_0} > S - 1$. Continuing recursively, there exists a least $n_j > p_{j-1}$
such that $a_n < S + 1/j$ for $n \ge n_j$, and there exists a least $p_j \ge n_j$ such
that $a_{p_j} > S - 1/j$. Then $S - 1/j < a_{p_j} < S + 1/j$, so that $a_{p_j} \to S$ as
$j \to \infty$, by the sandwich principle. An exactly similar proof works for the
lower limit.


What happens when the upper and lower limits are equal? 


Theorem 3.5.3 Suppose that $(a_n)_{n=0}^\infty$ is a bounded sequence and that
$l \in \mathbb{R}$. Then $a_n \to l$ as $n \to \infty$ if and only if $\limsup_{n \to \infty}(a_n) =
\liminf_{n \to \infty}(a_n) = l$.


Proof If $\limsup_{n \to \infty}(a_n) = \liminf_{n \to \infty}(a_n) = l$, then $m_j \to l$ and $M_j \to l$
as $j \to \infty$. Since $m_j \le a_j \le M_j$, $a_j \to l$, by the sandwich principle.
Conversely, suppose that $a_n \to l$ as $n \to \infty$. Then if $\epsilon > 0$ there exists
$n_0$ such that $l - \epsilon/2 < a_n < l + \epsilon/2$ for $n \ge n_0$. Thus $l - \epsilon < M_n < l + \epsilon$
for $n \ge n_0$, so that $M_n \to l$ as $n \to \infty$. Thus $\limsup_{n \to \infty}(a_n) = l$. Similarly
$\liminf_{n \to \infty}(a_n) = l$.


What do we do when $(a_n)_{n=0}^\infty$ is not bounded? If $(a_n)_{n=0}^\infty$ is not bounded
above, then $M_j = \infty$ for all $j \in \mathbb{Z}^+$; we therefore define $\limsup_{n \to \infty} a_n$ to be
$+\infty$. If $(a_n)_{n=0}^\infty$ is bounded above, but is not bounded below, then $(M_j)_{j=0}^\infty$ is
a decreasing sequence. If this is bounded below, we define $\limsup_{n \to \infty} a_n =
\lim_{n \to \infty} M_n$; if not, we set $\limsup_{n \to \infty} a_n$ to be $-\infty$. We treat $\liminf_{n \to \infty} a_n$
in a similar way.


Exercises 
3.5.1 Consider the second proof of the Bolzano–Weierstrass theorem in the
preceding section. What is $\lim_{j \to \infty} a_{m_j}$?

3.5.2 Suppose that $(a_n)_{n=0}^\infty$ is a bounded sequence. Let $s_n = a_0 + \cdots + a_n$.
Show that

$$\liminf_{n \to \infty} a_n \le \liminf_{n \to \infty} \frac{s_n}{n + 1} \le \limsup_{n \to \infty} \frac{s_n}{n + 1} \le \limsup_{n \to \infty} a_n.$$

Deduce that if $a_n \to l$ as $n \to \infty$ then $s_n/(n + 1) \to l$ as $n \to \infty$. Give
an example to show that the converse does not hold.

3.5.3 Suppose that $(a_n)_{n=0}^\infty$ is a bounded sequence. Let

$$U = \{x \in \mathbb{R} : \{n \in \mathbb{Z}^+ : a_n > x\} \text{ is finite}\}.$$

Show that $\inf U = \limsup_{n \to \infty} a_n$.

3.5.4 Suppose that $(a_n)_{n=0}^\infty$ and $(b_n)_{n=0}^\infty$ are bounded sequences. Show that

$$\liminf_{n \to \infty} a_n + \liminf_{n \to \infty} b_n \le \liminf_{n \to \infty}(a_n + b_n) \le \liminf_{n \to \infty} a_n + \limsup_{n \to \infty} b_n$$
$$\le \limsup_{n \to \infty}(a_n + b_n) \le \limsup_{n \to \infty} a_n + \limsup_{n \to \infty} b_n.$$

Show that equality holds in the last inequality if and only if there
exists a strictly increasing sequence $(n_j)_{j=0}^\infty$ in $\mathbb{N}$ such that

$$a_{n_j} \to \limsup_{n \to \infty} a_n \text{ and } b_{n_j} \to \limsup_{n \to \infty} b_n \text{ as } j \to \infty.$$

Give an example where all the inequalities are strict.

3.5.5 Suppose that $(a_n)_{n=0}^\infty$ and $(b_n)_{n=0}^\infty$ are sequences of positive numbers,
and that $a_n \to a$ as $n \to \infty$. Show that if $a > 0$ then $\liminf_{n \to \infty} a_n b_n =
a \liminf_{n \to \infty} b_n$. Show that equality need not hold if $a = 0$.

3.5.6 Suppose that $(s_n)_{n=0}^\infty$ is a sequence of real numbers and that $(t_n)_{n=0}^\infty$
is a strictly increasing unbounded sequence of positive numbers. Show
that $\limsup_{n \to \infty}(s_n/t_n) \le \limsup_{n \to \infty}((s_{n+1} - s_n)/(t_{n+1} - t_n))$.

3.5.7 Suppose that $(a_n)_{n=0}^\infty$ is a sequence of positive numbers. Show that

$$\limsup_{n \to \infty} a_n^{1/n} \le \limsup_{n \to \infty}(a_{n+1}/a_n).$$


3.6 The general principle of convergence 


Suppose that $(a_n)_{n=0}^\infty$ is a sequence of real numbers. How can we tell whether
it converges or not? If we suspect that its limit is $l$, we can consider the
behaviour of $|a_n - l|$ as $n$ becomes large. But what if we do not know what $l$
should be? We have seen that we can answer this question when $(a_n)_{n=0}^\infty$ is
monotonic (Theorem 3.2.4), but this only happens in special circumstances.
Here we provide a more general answer.

A sequence $(a_n)_{n=0}^\infty$ is a Cauchy sequence if whenever $\epsilon > 0$ there exists
$n_0$ (usually dependent on $\epsilon$) such that $|a_m - a_n| < \epsilon$ for $m, n \ge n_0$. The
terms of the sequence become close as $m$ and $n$ become large.


Proposition 3.6.1 A Cauchy sequence $(a_n)_{n=0}^\infty$ is bounded.


Proof The proof is just like the proof of Proposition 3.2.3. There exists
$n_0$ such that $|a_m - a_n| < 1$ for $m, n \ge n_0$. Let

$$M = \max\{|a_0|, |a_1|, \ldots, |a_{n_0}|, |a_{n_0}| + 1\}.$$

If $n \ge n_0$ then $|a_n| \le |a_n - a_{n_0}| + |a_{n_0}| < |a_{n_0}| + 1$, so that $|a_n| \le M$ for
all $n$.




Theorem 3.6.2 (The general principle of convergence) A sequence
$(a_n)_{n=0}^\infty$ of real numbers is convergent if and only if it is a Cauchy sequence.


Proof First, suppose that $a_n \to l$ as $n \to \infty$. Given $\epsilon > 0$ there exists $n_0$
such that $|a_n - l| < \epsilon/2$ for $n \ge n_0$. If $m, n \ge n_0$ then

$$|a_m - a_n| \le |a_m - l| + |a_n - l| < \epsilon/2 + \epsilon/2 = \epsilon,$$

so that $(a_n)_{n=0}^\infty$ is a Cauchy sequence.

Conversely, suppose that $(a_n)_{n=0}^\infty$ is a Cauchy sequence. Then it is
bounded, and so, by the Bolzano-Weierstrass theorem, it has a convergent
subsequence $(a_{n_k})_{k=0}^\infty$, convergent to $l$ say. We shall show that $a_n \to l$ as
$n \to \infty$. Suppose that $\epsilon > 0$. Then there exist $k_0$ such that $|a_{n_k} - l| < \epsilon/2$
for $k \ge k_0$ and $N$ such that $|a_m - a_n| < \epsilon/2$ for $m, n \ge N$. There exists
$k_1 \ge k_0$ such that $n_{k_1} \ge N$. If $n \ge N$ then

$$|a_n - l| \le |a_n - a_{n_{k_1}}| + |a_{n_{k_1}} - l| < \epsilon/2 + \epsilon/2 = \epsilon.$$


A Cauchy sequence is a sequence that looks as if it should converge. 
A Cauchy sequence of rational numbers need not converge to a rational 
number (consider a sequence of rational numbers converging to $\sqrt{2}$), but
does converge to a real number. This indicates again that the real numbers 
provide a good extension of the rational numbers. 
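
The remark about rational Cauchy sequences can be illustrated numerically. The following is a minimal sketch, not part of the text's argument: it assumes the particular rational sequence $x_0 = 1$, $x_{k+1} = (x_k + 2/x_k)/2$ (our choice, not one made above), and the helper name `newton_sqrt2` is hypothetical. It only inspects finitely many terms, so it suggests the behaviour rather than proving it.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Return the first `steps` terms of x_{k+1} = (x_k + 2/x_k)/2, x_0 = 1,
    computed exactly as rational numbers."""
    x = Fraction(1)
    terms = [x]
    for _ in range(steps - 1):
        x = (x + 2 / x) / 2
        terms.append(x)
    return terms

terms = newton_sqrt2(6)
# Gaps between consecutive terms: these shrink rapidly, as one expects
# of a Cauchy sequence.
gaps = [abs(terms[k + 1] - terms[k]) for k in range(len(terms) - 1)]

for k, x in enumerate(terms):
    # Every term is rational, yet x*x - 2 is never exactly zero.
    print(k, x, float(x * x - 2))
```

Every term is a rational number whose square differs from 2, while the squares approach 2: the sequence looks as if it should converge, but its limit is not rational.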


Exercises 


3.6.1 Show from the definitions that the upper and lower limits of a Cauchy 
sequence are equal. Use this to give another proof of the general 
principle of convergence. 

3.6.2 Let $a_n = \sqrt{n}$. Show that if $\epsilon > 0$ then there exists $n_0$ such that
$|a_{n+1} - a_n| < \epsilon$ for $n \ge n_0$. Is $(a_n)_{n=0}^\infty$ a Cauchy sequence?


3.7 Complex numbers 


This volume is principally concerned with real analysis: the study of
real-valued functions of a real variable, and sequences of real numbers. There
are however topics, such as the theory of power series, where it is natural
to consider complex-valued functions of a complex variable. This topic
will be considered in much more detail in Volume III, but here, and in
the next section, we introduce some of the basic properties of complex
numbers.

Why do we need to consider complex numbers? Although the construction
of the real numbers allows us to find roots of the polynomial $x^2 - 2$, there
are plenty of polynomials with no real roots. For example, if $a \in \mathbf{R}$ then
$a^2 \ge 0$, so that $a^2 + 1 \ge 1$, and so the polynomial $x^2 + 1$ has no real roots. We
overcome this by enlarging the real field $\mathbf{R}$ to obtain the complex field $\mathbf{C}$.

This is a problem of algebra, rather than analysis. We want to adjoin an
element $i$ to $\mathbf{R}$ with the property that $i^2 = -1$. We shall describe a simple
way of doing this; Exercises 3.7.1 and 3.7.2 provide other constructions. In
each case, we are concerned with vector spaces. Suppose that $K$ is a field. A
vector space $E$ over $K$ is an abelian additive group $(E, +)$, with zero element
$0$, together with a mapping (scalar multiplication) $(\lambda, x) \to \lambda x$ of $K \times E$ into
$E$ which satisfies

• $1.x = x$,
• $(\lambda + \mu)x = \lambda x + \mu x$,
• $\lambda(\mu x) = (\lambda\mu)x$,
• $\lambda(x + y) = \lambda x + \lambda y$,

for $\lambda, \mu \in K$ and $x, y \in E$. The elements of $E$ are called vectors and the
elements of $K$ are called scalars. A vector space over $\mathbf{R}$ is called a real vector
space.

It then follows that $0.x = 0$ and $\lambda.0 = 0$ for $x \in E$ and $\lambda \in K$. [Note that
we use the same symbol $0$ for the additive identity element in $E$ (the zero
vector) and the zero element in $K$ (the zero scalar).] We denote $E \setminus \{0\}$
by $E^*$.

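Besides the element $i$, we want to consider elements $bi$, where $b \in \mathbf{R}$, and
elements $a + bi$, where $a, b \in \mathbf{R}$. We therefore take $\mathbf{R}^2 = \{(x, y) : x, y \in \mathbf{R}\}$
as our underlying set. $\mathbf{R}^2$ is a real vector space:

$$(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2) \text{ and } a(x, y) = (ax, ay) \text{ for } a \in \mathbf{R}.$$

We set $\mathbf{1} = (1, 0)$ and $\mathbf{i} = (0, 1)$, so that any element $(x, y) \in \mathbf{R}^2$ can be
written as $(x, y) = x\mathbf{1} + y\mathbf{i}$. We want to define an associative and distributive
multiplication in such a way that

$$\mathbf{1}^2 = \mathbf{1}, \quad \mathbf{1}\mathbf{i} = \mathbf{i}\mathbf{1} = \mathbf{i} \quad \text{and} \quad \mathbf{i}^2 = -\mathbf{1}.$$

Thus we require that

$$(x_1\mathbf{1} + y_1\mathbf{i})(x_2\mathbf{1} + y_2\mathbf{i}) = x_1x_2\,\mathbf{1}.\mathbf{1} + x_1y_2\,\mathbf{1}.\mathbf{i} + y_1x_2\,\mathbf{i}.\mathbf{1} + y_1y_2\,\mathbf{i}.\mathbf{i}$$
$$= (x_1x_2 - y_1y_2)\mathbf{1} + (x_1y_2 + y_1x_2)\mathbf{i},$$

and so we define multiplication by setting

$$(x_1\mathbf{1} + y_1\mathbf{i})(x_2\mathbf{1} + y_2\mathbf{i}) = (x_1x_2 - y_1y_2)\mathbf{1} + (x_1y_2 + y_1x_2)\mathbf{i}.$$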


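We denote $\mathbf{R}^2$, with this multiplication, by $\mathbf{C}$. Note that if $a, b \in \mathbf{R}$ then
$a\mathbf{1} + b\mathbf{1} = (a + b)\mathbf{1}$ and $(a\mathbf{1})(b\mathbf{1}) = ab\mathbf{1}$, so that if we identify $\mathbf{R}$ with
$\mathbf{R}.\mathbf{1} = \{(a, 0) : a \in \mathbf{R}\}$, then the addition and multiplication on $\mathbf{C}$ extend
the addition and multiplication on $\mathbf{R}$.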
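We need to verify that this multiplication is commutative (this is clear
from the definition) and associative. We verify associativity directly:

$$[(x_1\mathbf{1} + y_1\mathbf{i})(x_2\mathbf{1} + y_2\mathbf{i})](x_3\mathbf{1} + y_3\mathbf{i})$$
$$= [(x_1x_2 - y_1y_2)\mathbf{1} + (x_1y_2 + y_1x_2)\mathbf{i}](x_3\mathbf{1} + y_3\mathbf{i})$$
$$= ((x_1x_2 - y_1y_2)x_3 - (x_1y_2 + y_1x_2)y_3)\mathbf{1} + ((x_1x_2 - y_1y_2)y_3 + (x_1y_2 + y_1x_2)x_3)\mathbf{i}$$
$$= (x_1(x_2x_3 - y_2y_3) - y_1(x_2y_3 + y_2x_3))\mathbf{1} + (x_1(x_2y_3 + y_2x_3) + y_1(x_2x_3 - y_2y_3))\mathbf{i}$$
$$= (x_1\mathbf{1} + y_1\mathbf{i})[(x_2x_3 - y_2y_3)\mathbf{1} + (x_2y_3 + y_2x_3)\mathbf{i}]$$
$$= (x_1\mathbf{1} + y_1\mathbf{i})[(x_2\mathbf{1} + y_2\mathbf{i})(x_3\mathbf{1} + y_3\mathbf{i})].$$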


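It is equally straightforward to verify the distributive law:

$$(x_1\mathbf{1} + y_1\mathbf{i})[(x_2\mathbf{1} + y_2\mathbf{i}) + (x_3\mathbf{1} + y_3\mathbf{i})]$$
$$= [(x_1\mathbf{1} + y_1\mathbf{i})(x_2\mathbf{1} + y_2\mathbf{i})] + [(x_1\mathbf{1} + y_1\mathbf{i})(x_3\mathbf{1} + y_3\mathbf{i})].$$

If $z = x\mathbf{1} + y\mathbf{i} \ne 0$ then $x^2 + y^2 \ne 0$. We define

$$z^{-1} = \frac{x}{x^2 + y^2}\mathbf{1} - \frac{y}{x^2 + y^2}\mathbf{i},$$

and then $zz^{-1} = z^{-1}z = \mathbf{1}$, so that $z^{-1}$ is the multiplicative inverse of $z$;
it is also written as $1/z$. Thus $\mathbf{C}$ is a field, the complex number field, which
has a subfield $\mathbf{R}\mathbf{1}$ isomorphic to $\mathbf{R}$.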
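
The construction above can be checked mechanically on a small sample. The following is a minimal sketch, assuming pairs of integers as test data; the helper names `mul`, `add` and `inv` are ours, introduced only for illustration. It spot-checks the field laws on finitely many pairs and is not a substitute for the verification in the text.

```python
from fractions import Fraction
from itertools import product

def mul(z, w):
    """(x1, y1)(x2, y2) = (x1 x2 - y1 y2, x1 y2 + y1 x2), as defined above."""
    (x1, y1), (x2, y2) = z, w
    return (x1 * x2 - y1 * y2, x1 * y2 + y1 * x2)

def add(z, w):
    """Componentwise addition in R^2."""
    return (z[0] + w[0], z[1] + w[1])

def inv(z):
    """Inverse (x/(x^2+y^2), -y/(x^2+y^2)) of a non-zero pair of integers."""
    x, y = z
    n = x * x + y * y          # non-zero whenever z != (0, 0)
    return (Fraction(x, n), Fraction(-y, n))

ONE, I = (1, 0), (0, 1)
# A small grid of integer pairs used as test data.
samples = list(product(range(-2, 3), repeat=2))

print(mul(I, I))               # the pair representing -1
print(mul((1, 2), inv((1, 2))))  # the pair representing 1
```

Exact rational arithmetic is used for the inverse so that the identity $zz^{-1} = \mathbf{1}$ can be tested with equality rather than with a floating-point tolerance.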

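If $z = x\mathbf{1} + y\mathbf{i}$, we define its (complex) conjugate $\bar{z}$ to be $\bar{z} = x\mathbf{1} - y\mathbf{i}$.


Theorem 3.7.1 If $z \in \mathbf{C}$ then $\bar{\bar{z}} = z$. The mapping $z \to \bar{z}$ is a field
isomorphism of $\mathbf{C}$ onto itself: $\bar{\mathbf{1}} = \mathbf{1}$, and if $z, w \in \mathbf{C}$ then $\overline{z + w} = \bar{z} + \bar{w}$,
$\overline{zw} = \bar{z}\,\bar{w}$ and $\overline{1/z} = 1/\bar{z}$. If $z = x\mathbf{1} + y\mathbf{i}$ then $z\bar{z} = (x^2 + y^2)\mathbf{1}$.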


Proof Easy direct verification. 




We now write $1$ for $\mathbf{1}$, and $i$ for $\mathbf{i}$, so that $z = (x, y) = x\mathbf{1} + y\mathbf{i} = x + iy$. $x$ is
the real part of $z$, denoted by $\Re z$, and $y$ is the imaginary part, denoted by
$\Im z$. If $y = 0$ then $z$ is real, and if $x = 0$ then $z$ is pure imaginary.

We have therefore embedded the real number field $\mathbf{R}$ in a larger field $\mathbf{C}$, in
which the polynomial $x^2 + 1$ has two roots, $i$ and $-i$, and we can factorize the
polynomial $x^2 + 1$ as $(x - i)(x + i)$. The construction is straightforward, but
the step is enormous. As we shall see, the real numbers, and real analysis, are
fascinating. By comparison, the complex numbers, and complex analysis, are
magical. Let us state one result to illustrate this. If $p(x) = a_nx^n + \cdots + a_0$ is a
complex polynomial, with $n > 0$ and $a_n \ne 0$, then $p$ has a root in $\mathbf{C}$, and we
can express $p$ as a product of linear factors: $p(x) = a_n(x - c_1)\ldots(x - c_n)$. We
have extended the field to deal with one very simple quadratic polynomial,
and the resulting extension is powerful enough to handle all polynomials.

We set $|z| = (x^2 + y^2)^{1/2}$, so that $|z|^2 = z\bar{z}$. The quantity $|z|$ is the modulus,
or absolute value, of $z$; it measures the size of $z$. Note that $|\bar{z}| = |z|$. If $x$ is
real, its modulus as a real number is the same as its modulus as a complex
number. If $z \ne 0$, then $|z| > 0$, and $z^{-1} = \bar{z}/|z|^2$.

Note that $z + \bar{z} = 2x$, and $z - \bar{z} = 2iy$, so that $|z + \bar{z}| \le 2|z|$, with equality
if and only if $z$ is real, and $|z - \bar{z}| \le 2|z|$, with equality if and only if $z$ is
pure imaginary. Note also that

$$|zw|^2 = (zw)(\overline{zw}) = z\bar{z}\,w\bar{w} = |z|^2|w|^2, \text{ so that } |zw| = |z||w|.$$

Proposition 3.7.2 If $z_1, z_2 \in \mathbf{C}$, set $d(z_1, z_2) = |z_1 - z_2|$. Then


$d(z_1, z_2) = d(z_2, z_1)$;
$d(z_1, z_2) = 0$ if and only if $z_1 = z_2$;
$d(z_1, z_3) \le d(z_1, z_2) + d(z_2, z_3)$ (the triangle inequality).


Proof The first two statements are obvious. For the third, let $v = z_1 - z_2$
and $w = z_2 - z_3$. Then we must show that $|v + w| \le |v| + |w|$. Let $t = v\bar{w}$,
so that $\bar{t} = \bar{v}w$ and $|t| = |v|.|w|$. Then

$$|v + w|^2 = (v + w)(\bar{v} + \bar{w}) = v\bar{v} + t + \bar{t} + w\bar{w}$$
$$\le |v|^2 + |t + \bar{t}| + |w|^2 \le |v|^2 + 2|t| + |w|^2$$
$$= |v|^2 + 2|v||w| + |w|^2 = (|v| + |w|)^2.$$


Figure 3.7. The Argand diagram.


Again, $d$ is a metric on $\mathbf{C}$, which extends the metric on $\mathbf{R}$. We can consider
a point $(x, y) \in \mathbf{R}^2$ as a point in the plane, with coordinates $x$ and $y$. When
we identify $\mathbf{C}$ with $\mathbf{R}^2$, the plane is called the complex plane or Argand
diagram.

If $w = u + iv \in \mathbf{C}$, the mapping $z \to z + w$ is represented by a shift,
sending $(x, y)$ to $(x + u, y + v)$. The mapping $z \to \bar{z}$ is represented by reflection
in the real axis $\{(x, y) \in \mathbf{R}^2 : y = 0\}$. We shall consider the geometry of
multiplication later, when we have established further properties of complex
numbers.

In Figure 3.7, we take $z = 1 + (3/4)i$ and $w = 5/12 + i$.

We end this section by listing some of the subsets of C of particular 
importance. 


• $\mathbf{C}^* = \{z \in \mathbf{C} : z \ne 0\} = \mathbf{C} \setminus \{0\}$ is the punctured plane.
• $\mathbf{D} = \{z \in \mathbf{C} : |z| < 1\}$ is the open unit disc.
• $\bar{\mathbf{D}} = \{z \in \mathbf{C} : |z| \le 1\}$ is the closed unit disc.
• $\mathbf{T} = \{z \in \mathbf{C} : |z| = 1\}$ is the unit circle.
• $H_+ = \{z = x + iy : y > 0\}$ is the upper half-plane.
• $H_- = \{z = x + iy : y < 0\}$ is the lower half-plane.
• $H_r = \{z = x + iy : x > 0\}$ is the right-hand half-plane.
• $H_l = \{z = x + iy : x < 0\}$ is the left-hand half-plane.

Exercises


3.7.1 Let $\mathbf{R}[x]$ denote the set of all real polynomials. Let $N$ be the set of
all elements of $\mathbf{R}[x]$ which are divisible by $x^2 + 1$: $p(x) \in N$ if we can
write $p(x) = (x^2 + 1)q(x)$, with $q(x) \in \mathbf{R}[x]$. Define a relation on $\mathbf{R}[x]$
by setting $p(x) \sim r(x)$ if $p(x) - r(x) \in N$. Verify that this is an
equivalence relation. Show that each equivalence class contains an element
of degree at most 1. Define operations on the equivalence classes by
setting $[p(x)] + [r(x)] = [p(x) + r(x)]$, $[p(x)].[r(x)] = [p(x).r(x)]$. Show
that these definitions do not depend on the choice of representatives.
Show that with these operations, the quotient space $\mathbf{R}[x]/\sim$ becomes
a field, isomorphic to $\mathbf{C}$.

3.7.2 If $f$ and $g$ are mappings from $\mathbf{R}^2$ to $\mathbf{R}^2$ and $a, b \in \mathbf{R}$, define
the mapping $af + bg : \mathbf{R}^2 \to \mathbf{R}^2$ by setting $(af + bg)(z) =
af(z) + bg(z)$ and define $fg$ as $f \circ g$. Let $I((x, y)) = (x, y)$ and let
$J((x, y)) = (-y, x)$. Show that with these laws of composition, the
set of mappings $\{aI + bJ : (a, b) \in \mathbf{R}^2\}$ becomes a field, isomorphic
to $\mathbf{C}$.

3.7.3 Suppose that $\theta : \mathbf{C} \to \mathbf{C}$ is a field isomorphism of $\mathbf{C}$ onto itself for
which $\theta(x) = x$ for $x$ real. Show that either $\theta(z) = z$ for all $z \in \mathbf{C}$ or
$\theta(z) = \bar{z}$ for all $z \in \mathbf{C}$.

3.7.4 Show that if $x$ is a non-zero element of an ordered field then
$x^2 > 0$. Show that there is no total ordering of $\mathbf{C}$ which makes it
an ordered field.

3.7.5 Suppose that $z = x + iy$, with $y > 0$. Show that there are positive
real numbers $u$ and $v$ with $2u^2 = |z| + x$ and $2v^2 = |z| - x$. Calculate
$(u + iv)^2$. Show that $z$ has exactly two complex square roots. Show
that the same holds when $y < 0$.

3.7.6 Suppose that $z_1, z_2 \in \mathbf{C}$. Show that $|z_1 + z_2|^2 + |z_1 - z_2|^2 =
2(|z_1|^2 + |z_2|^2)$ (the parallelogram law). Use induction to find a
corresponding result for a finite set $\{z_1, \ldots, z_n\}$ of complex numbers.

3.7.7 Sketch the region $\{z \in \mathbf{C} : |z - 1| < 1\}$ in the Argand diagram.

3.7.8 Sketch the region $\{z \in \mathbf{C} : |z| < 2|z - 3|\}$ in the Argand diagram.

3.7.9 Sketch the sets $\Re(z^2) = c$ and $\Im(z^2) = d$, where $c$ and $d$ are real
constants.

3.7.10 A triple $(a, b, c)$ of integers is called a Pythagorean triple if $a^2 + b^2 = c^2$.
Suppose that $(a, b, c)$ and $(m, n, p)$ are Pythagorean triples. Verify
that $(am - bn, an + bm, cp)$ is a Pythagorean triple, and interpret
this in terms of complex multiplication.

3.8 The convergence of complex sequences


In this volume, we concentrate almost exclusively on real analysis. In the 
next chapter, however, we consider infinite series, and, in particular, power 
series. Here it is appropriate to consider series with complex terms. In this 
section we consider the convergence of complex-valued sequences. 


The definitions are very straightforward generalizations of the definitions
in the real case. Suppose that $(z_n)_{n=1}^\infty$ is a sequence of complex numbers.
It converges to a complex number $z$ if whenever $\epsilon > 0$ there exists $n_0 \in \mathbf{N}$
such that $|z_n - z| < \epsilon$ for all $n \ge n_0$, and it is a Cauchy sequence if whenever
$\epsilon > 0$ there exists $n_0 \in \mathbf{N}$ such that $|z_n - z_m| < \epsilon$ for all $m, n \ge n_0$. We
write $z_n \to z$ as $n \to \infty$ if $z_n$ converges to $z$ as $n \to \infty$. If $z_n$ converges to $0$
as $n \to \infty$, we say that $(z_n)_{n=1}^\infty$ is a null sequence.

These definitions can be expressed in terms of real sequences. If $z$ or $z_n$
is a complex number and we write $z = x + iy$ or $z_n = x_n + iy_n$, then $x$ and
$x_n$ are always the real parts and $y$ and $y_n$ the imaginary parts of $z$ and $z_n$,
respectively.


Proposition 3.8.1 Suppose that $(z_n)_{n=1}^\infty = (x_n + iy_n)_{n=1}^\infty$ is a sequence in
$\mathbf{C}$ and that $z = x + iy \in \mathbf{C}$. Then $z_n \to z$ as $n \to \infty$ if and only if $x_n \to x$
and $y_n \to y$ as $n \to \infty$. The sequence $(z_n)_{n=1}^\infty$ is a Cauchy sequence if and
only if each of the real sequences $(x_n)_{n=1}^\infty$ and $(y_n)_{n=1}^\infty$ is a Cauchy sequence.


Proof First suppose that $z_n \to z$ as $n \to \infty$. Since $|x_n - x| \le |z_n - z|$
and $|y_n - y| \le |z_n - z|$, $x_n \to x$ and $y_n \to y$ as $n \to \infty$. Conversely, since
$|z_n - z| \le |x_n - x| + |y_n - y|$, it follows that if $x_n \to x$ and $y_n \to y$ as
$n \to \infty$, then $z_n \to z$ as $n \to \infty$. The proof of the result concerning Cauchy
sequences is essentially the same.


These elementary results enable us to deduce the following results from
the results of Section 3.1. A subset $B$ of $\mathbf{C}$ is bounded if $\{|z| : z \in B\}$ is
bounded in $\mathbf{R}$. A sequence $(z_n)_{n=1}^\infty$ is bounded if the set of values
$\{z_n : n \in \mathbf{N}\}$ is bounded.


Theorem 3.8.2 Suppose that $(z_n)_{n=1}^\infty$ and $(w_n)_{n=1}^\infty$ are sequences in $\mathbf{C}$.


(i) If $z_n \to z$ as $n \to \infty$ and $z_n \to w$ as $n \to \infty$, then $z = w$.
(ii) If $(z_n)_{n=1}^\infty$ is a convergent sequence in $\mathbf{C}$, then it is bounded.
(iii) If $z_n = z$ for all $n$, then $z_n \to z$ as $n \to \infty$.
(iv) If $(z_n)_{n=1}^\infty$ is a null sequence, and $(w_n)_{n=1}^\infty$ is bounded, then $(z_nw_n)_{n=1}^\infty$
is a null sequence.
(v) If $z_n \to z$ and $w_n \to w$ as $n \to \infty$ then $z_n + w_n \to z + w$ as $n \to \infty$.
(vi) If $z_n \to z$ and $w_n \to w$ as $n \to \infty$ then $z_nw_n \to zw$ as $n \to \infty$.
(vii) If $z_n \ne 0$ and $z \ne 0$ and $z_n \to z$ as $n \to \infty$ then $1/z_n \to 1/z$ as
$n \to \infty$.
(viii) If $z_n \to z$ as $n \to \infty$ and if $(z_{n_k})_{k=1}^\infty$ is a subsequence, then $z_{n_k} \to z$
as $k \to \infty$.


Proof The reader should verify that these results follow from the results 
of Section 3.2 and Proposition 3.8.1. 


Similarly, we have the following results. 


Theorem 3.8.3 (The complex Bolzano-Weierstrass theorem) Suppose
that $(z_n)_{n=1}^\infty$ is a bounded sequence of complex numbers. Then there is a
subsequence $(z_{n_k})_{k=1}^\infty$ which converges.


Proof By the real Bolzano-Weierstrass theorem, there exists a
subsequence $(z_{m_j})_{j=1}^\infty$ such that the real sequence $(x_{m_j})_{j=1}^\infty$ converges,
and there exists a subsequence $(z_{n_k})_{k=1}^\infty$ of that for which $(y_{n_k})_{k=1}^\infty$ converges.
Since $(x_{n_k})_{k=1}^\infty$ is a subsequence of $(x_{m_j})_{j=1}^\infty$, it also converges.
Then $(z_{n_k})_{k=1}^\infty$ converges, by Proposition 3.8.1.


Theorem 3.8.4 (The complex general principle of convergence) A
sequence $(z_n)_{n=1}^\infty$ of complex numbers is convergent if and only if it is a
Cauchy sequence.


Proof This follows easily from Proposition 3.8.1. 


Example 3.8.5 Suppose that $z \in \mathbf{C}$. Let $z_n = z^n$. Then $z_n \to 0$ as
$n \to \infty$ if $|z| < 1$, and $z_n \to 1$ if $z = 1$. Otherwise, the sequence $(z_n)_{n=1}^\infty$ does
not converge.


For $|z_n - 0| = |z|^n$. If $|z| < 1$ then $|z|^n \to 0$ as $n \to \infty$, from which it
follows that $z_n \to 0$ as $n \to \infty$. If $z = 1$ then $z_n = 1$ for all $n$, so that $z_n \to 1$
as $n \to \infty$. If $|z| \ge 1$ and $z \ne 1$ then $|z_{n+1} - z_n| = |z^n||z - 1| \ge |z - 1|$ for
all $n \in \mathbf{N}$, so that $(z_n)_{n=1}^\infty$ is not a Cauchy sequence, and therefore does not
converge.
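
The two kinds of behaviour in Example 3.8.5 can be seen numerically. The following is a minimal sketch in floating-point arithmetic; the sample points $0.5 + 0.5i$ and $i$ are our choices, and the helper name `powers` is hypothetical.

```python
def powers(z, n_terms):
    """Return [z**1, ..., z**n_terms] for a complex number z."""
    out, w = [], 1 + 0j
    for _ in range(n_terms):
        w *= z
        out.append(w)
    return out

inside = powers(0.5 + 0.5j, 60)   # |z| < 1: the terms shrink towards 0
on_circle = powers(1j, 60)        # |z| = 1, z != 1: the terms cycle i, -1, -i, 1

# Successive differences |z_{n+1} - z_n| on the unit circle stay equal to
# |z - 1|, so the sequence cannot be a Cauchy sequence.
steps = [abs(on_circle[k + 1] - on_circle[k]) for k in range(len(on_circle) - 1)]

print(abs(inside[-1]))
print(min(steps), max(steps))
```

For $z = i$ the successive differences all have modulus $|i - 1| = \sqrt{2}$, which is the obstruction used in the example.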


Exercise 


3.8.1 Suppose that $(z_n)_{n=1}^\infty$ is a sequence in $\mathbf{C}$ which converges to $z$. Show
that $\bar{z}_n \to \bar{z}$ and $|z_n| \to |z|$ as $n \to \infty$.

4.1 Infinite series


The notion of convergence of a sequence allows us to consider infinite sums,
or series. Once again, we take either $\mathbf{N}$ or $\mathbf{Z}^+$ as index set. We shall generally
consider the case where the terms of the series are complex-valued; since
$\mathbf{R} \subseteq \mathbf{C}$, the results will also apply to the case where all the terms are real-valued.
Suppose that $(z_j)_{j=0}^\infty$ is a sequence of complex numbers. We set

$$s_n = \sum_{j=0}^n z_j = z_0 + \cdots + z_n,$$

where $s_n$ is the $n$th partial sum. If $s_n \to s$ as $n \to \infty$, we say that the infinite
sum, or infinite series, $\sum_{j=0}^\infty z_j$ converges to $s$. If $s_n$ does not converge, then
we say that $\sum_{j=0}^\infty z_j$ diverges.

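Here are two easy examples: as we shall see, the first one is particularly
useful. Suppose that $|z| < 1$. Let $z_j = z^j$ for $j \in \mathbf{Z}^+$. Then

$$(1 - z)s_n = (1 + z + \cdots + z^n) - (z + z^2 + \cdots + z^{n+1}) = 1 - z^{n+1},$$

so that

$$s_n = \frac{1 - z^{n+1}}{1 - z} = \frac{1}{1 - z} - \frac{z^{n+1}}{1 - z} \quad \text{and} \quad s_n \to \frac{1}{1 - z} \text{ as } n \to \infty.$$

Thus $\sum_{j=0}^\infty z^j = 1/(1 - z)$.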
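
The closed form for the partial sums of the geometric series can be compared with a term-by-term sum. The following is a minimal sketch in floating-point arithmetic; the sample point $z = 0.3 + 0.4i$ (with $|z| = 0.5$) is our choice and the helper name `geometric_partial_sum` is hypothetical.

```python
def geometric_partial_sum(z, n):
    """s_n = z**0 + z**1 + ... + z**n, summed term by term."""
    s, term = 0j, 1 + 0j
    for _ in range(n + 1):
        s += term
        term *= z
    return s

z = 0.3 + 0.4j                    # |z| = 0.5 < 1
n = 20
s_n = geometric_partial_sum(z, n)
closed_form = (1 - z ** (n + 1)) / (1 - z)   # the formula derived above
limit = 1 / (1 - z)

print(abs(s_n - closed_form))     # agreement up to rounding error
print(abs(s_n - limit))           # equals |z|^(n+1) / |1 - z| up to rounding
```

The distance from $s_n$ to the limit is $|z|^{n+1}/|1 - z|$, which makes the rate of convergence explicit.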


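Secondly, let

$$a_j = \frac{1}{j(j + 1)} = \frac{1}{j} - \frac{1}{j + 1} \quad \text{for } j \in \mathbf{N}.$$

Then

$$s_n = \sum_{j=1}^n a_j = \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \cdots + \left(\frac{1}{n} - \frac{1}{n + 1}\right) = 1 - \frac{1}{n + 1},$$

so that $s_n \to 1$: $\sum_{j=1}^\infty 1/j(j + 1) = 1$.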

We can apply the results that we have obtained about convergent sequences 
to infinite series. For example, a complex series is convergent if and only if 
the sum of the real parts and the sum of the imaginary parts of the terms 
both converge. 


Proposition 4.1.1 Suppose that $z_j = x_j + iy_j$ and that $s = \sigma + i\tau$. Then
$\sum_{j=0}^\infty z_j$ converges to $s$ if and only if $\sum_{j=0}^\infty x_j$ converges to $\sigma$ and $\sum_{j=0}^\infty y_j$
converges to $\tau$.


The following result follows immediately from Theorems 3.8.2 and 3.2.5. 


Theorem 4.1.2 Suppose that $\sum_{j=0}^\infty z_j$ converges to $s$ and that $\sum_{j=0}^\infty w_j$
converges to $t$.
(i) When it exists, the sum is unique: if $\sum_{j=0}^\infty z_j = s'$, then $s = s'$.
(ii) $\sum_{j=0}^\infty (z_j + w_j)$ converges to $s + t$.
(iii) If $c \in \mathbf{C}$ then $\sum_{j=0}^\infty cz_j$ converges to $cs$.
(iv) If $z_j$ and $w_j$ are real, and $z_j \le w_j$ for all $j$, then $s \le t$.


Suppose that $(j_k)_{k=0}^\infty$ is a strictly increasing sequence in $\mathbf{Z}^+$. Set
$b_0 = \sum_{j=0}^{j_0} z_j$, and set $b_k = \sum_{j=j_{k-1}+1}^{j_k} z_j$ for $k > 0$. Then the sequence $(b_k)_{k=0}^\infty$
is called a block sequence, or bracketed sequence, derived from $(z_j)_{j=0}^\infty$. The
following result then follows immediately from Theorem 3.2.5 (viii).


Proposition 4.1.3 If $\sum_{j=0}^\infty z_j$ converges to $s$ and $(b_k)_{k=0}^\infty$ is a block
sequence derived from it, then $\sum_{k=0}^\infty b_k$ converges to $s$.


The converse is false in general (but see Corollary 4.2.3 below). Let $z_j =
(-1)^j$ for $j \in \mathbf{Z}^+$. Then $s_{2n} = 1$ and $s_{2n+1} = 0$ for all $n \in \mathbf{Z}^+$, so that
$\sum_{j=0}^\infty z_j$ diverges. If we set $j_k = 2k + 1$, then $b_k = z_{2k} + z_{2k+1} = 0$ for
$k \in \mathbf{Z}^+$, so that $\sum_{k=0}^\infty b_k$ converges to $0$.
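
The counterexample can be replayed on finitely many terms. The following is a minimal sketch; the helper names `partial_sums` and `blocks` are ours, and only the first 20 terms of the series with terms $(-1)^j$ are examined.

```python
def partial_sums(terms):
    """Return the list of partial sums of a finite list of terms."""
    out, s = [], 0
    for t in terms:
        s += t
        out.append(s)
    return out

def blocks(terms, ends):
    """Sum `terms` over consecutive blocks; `ends` lists the last index
    j_k of each block and must be strictly increasing."""
    out, start = [], 0
    for end in ends:
        out.append(sum(terms[start:end + 1]))
        start = end + 1
    return out

z = [(-1) ** j for j in range(20)]               # terms (-1)^j, j = 0, ..., 19
b = blocks(z, [2 * k + 1 for k in range(10)])    # blocks ending at j_k = 2k + 1

print(partial_sums(z))   # oscillates, so the original series diverges
print(partial_sums(b))   # constant, so the bracketed series converges
```

Bracketing hides the oscillation of the partial sums, which is why convergence of the block series alone does not give convergence of the original series.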

We also have the following simple result. 


Proposition 4.1.4 If $\sum_{j=0}^\infty z_j$ converges, then $z_j \to 0$ as $j \to \infty$.


Proof Suppose that the sum is $s$. Then $s_j \to s$ and $s_{j-1} \to s$ as $j \to \infty$, so
that $z_j = s_j - s_{j-1} \to 0$ as $j \to \infty$.


The general principle of convergence takes the following form.

Theorem 4.1.5 (The general principle of convergence) Suppose that
$(z_j)_{j=0}^\infty$ is a sequence of complex numbers. Then $\sum_{j=0}^\infty z_j$ converges if and
only if given $\epsilon > 0$ there exists $n_0$ such that $|s_n - s_m| = |\sum_{j=m+1}^n z_j| < \epsilon$
for $n > m \ge n_0$.


Proof This follows immediately from the corresponding result for sequences. 


Exercises 


4.1.1 Show that if $|z| < 1$ then the series $\sum_{j=0}^\infty (j + 1)z^j$ converges, and find
its sum.

4.1.2 Simplify $1/(1 - z) - z/(1 - z^2)$. Hence or otherwise show that if $|z| \ne 1$
then $\sum_{n=0}^\infty z^{2^n}/(1 - z^{2^{n+1}})$ converges, and find its sum.

4.1.3 Simplify $z/(1 - z) - z/(1 + z)$. Hence show that if $|z| < 1$ then

$$\frac{z}{1 + z} + \frac{2z^2}{1 + z^2} + \frac{4z^4}{1 + z^4} + \frac{8z^8}{1 + z^8} + \cdots$$

converges, and find its sum.


4.2 Series with non-negative terms 


Series with real non-negative terms behave particularly well. Theorem 3.2.4 
has the following immediate consequence. 


Theorem 4.2.1 Suppose that $(a_j)_{j=0}^\infty$ is a sequence of non-negative real
numbers, and that $s_n = \sum_{j=0}^n a_j$. Then $(s_n)_{n=0}^\infty$ is an increasing sequence.
Either $(s_n)_{n=0}^\infty$ is bounded, in which case $\sum_{j=0}^\infty a_j$ converges to $\sup_n s_n$, or
$s_n \to \infty$, in which case we say that $\sum_{j=0}^\infty a_j$ diverges to $+\infty$, and write
$\sum_{j=0}^\infty a_j = +\infty$.


This theorem indicates that summing a series of non-negative terms is 
reasonably straightforward. Here are some of its consequences; the first is 
one of many tests for convergence. 


Corollary 4.2.2 (The comparison test) If $0 \le c_j \le a_j$ for all $j \ge j_0$ and
$\sum_{j=j_0}^\infty a_j$ converges then $\sum_{j=j_0}^\infty c_j$ converges, and $\sum_{j=j_0}^\infty c_j \le \sum_{j=j_0}^\infty a_j$.

For example, $\sum_{j=1}^\infty 1/j^2$ converges, since $1/j^2 \le 2/j(j + 1)$. Note that
this corollary does not tell us what the sum is, although we can deduce from
Theorem 4.1.2 that it is at most 2. (In fact the sum is $\pi^2/6$; we shall prove
this much later!)

Corollary 4.2.3 If $(a_j)_{j=0}^\infty$ is a sequence of non-negative numbers and
$(b_k)_{k=0}^\infty$ is a block sequence derived from it, then $\sum_{j=0}^\infty a_j$ converges to $s$ if
and only if $\sum_{k=0}^\infty b_k$ converges to $s$.


Proof Since $(s_n)_{n=0}^\infty$ is increasing, $s_n \to s$ as $n \to \infty$ if and only if
$s_{j_l} \to s$ as $l \to \infty$, and $s_{j_l} = \sum_{k=0}^l b_k$.


We can say more when $(a_j)_{j=0}^\infty$ is a decreasing sequence of non-negative
numbers.


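Corollary 4.2.4 (The compression principle) If $(a_j)_{j=1}^\infty$ is a decreasing
sequence of non-negative real numbers, then $\sum_{j=1}^\infty a_j$ converges if and only
if $\sum_{k=0}^\infty 2^ka_{2^k}$ converges. If so then

$$\frac{1}{2}\sum_{k=1}^\infty 2^ka_{2^k} \le \sum_{j=2}^\infty a_j \le \sum_{k=0}^\infty 2^ka_{2^k}.$$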


Proof Let $(b_k)_{k=0}^\infty$ be the block sequence obtained by taking $j_k = 2^k$. Then,
for $k \ge 1$,

$$\frac{1}{2}2^ka_{2^k} = 2^{k-1}a_{2^k} \le b_k = a_{2^{k-1}+1} + \cdots + a_{2^k} \le 2^{k-1}a_{2^{k-1}},$$

since $(a_j)_{j=1}^\infty$ is decreasing, and there are $2^{k-1}$ summands. Thus the
convergence result follows from two applications of the comparison test and the
inequalities from Theorem 4.1.2 (iv).


Corollary 4.2.5 The harmonic series $\sum_{j=1}^\infty 1/j$ diverges to $+\infty$.


Proof For if $a_j = 1/j$ then $2^ka_{2^k} = 1$, so that the result follows from the
preceding corollary.
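
The block estimate behind the compression principle can be observed for the harmonic series. The following is a minimal sketch in exact rational arithmetic; the helper names are ours. It uses the bounds $1 + K/2 \le s_{2^K} \le 1 + K$, which follow by summing $1/2 \le b_k \le 1$ over the blocks $k = 1, \ldots, K$ and adding $a_1 = 1$.

```python
from fractions import Fraction

def harmonic_partial_sum(n):
    """s_n = 1 + 1/2 + ... + 1/n, computed exactly."""
    return sum(Fraction(1, j) for j in range(1, n + 1))

def condensed_partial_sum(K):
    """Sum of 2^k * a_{2^k} for k = 0, ..., K with a_j = 1/j.
    Every term equals 1, so the sum is K + 1."""
    return sum(2 ** k * Fraction(1, 2 ** k) for k in range(K + 1))

for K in range(1, 8):
    s = harmonic_partial_sum(2 ** K)
    # s_{2^K} is trapped between 1 + K/2 and 1 + K, so it is unbounded.
    print(K, float(s), float(condensed_partial_sum(K)))
```

The partial sums $s_{2^K}$ grow at least linearly in $K$, which is the divergence asserted in Corollary 4.2.5.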


Corollary 4.2.6 (Cauchy's test) Suppose that $(a_j)_{j=0}^\infty$ is a bounded
sequence of non-negative real numbers. If $\limsup_{j\to\infty} a_j^{1/j} < 1$ then $\sum_{j=0}^\infty a_j$
converges, and if $\limsup_{j\to\infty} a_j^{1/j} > 1$ then $\sum_{j=0}^\infty a_j = +\infty$.


Proof In the first case, choose $r$ such that $\limsup_{j\to\infty} a_j^{1/j} < r < 1$. Then
there exists $j_0$ such that $a_j^{1/j} < r$ for $j \ge j_0$. Thus $a_j < r^j$ for $j \ge j_0$ and so,
using the comparison test, $\sum_{j=0}^\infty a_j$ converges.

In the second case, for each $j \in \mathbf{Z}^+$ there exists $k \ge j$ such that $a_k^{1/k} > 1$,
so that $a_k > 1$. Thus $(a_j)_{j=0}^\infty$ is not a null sequence and so $\sum_{j=0}^\infty a_j$ diverges
to $+\infty$.

Corollary 4.2.7 (D'Alembert's ratio test) Suppose that $(a_j)_{j=0}^\infty$ is a
sequence of positive real numbers. If $\limsup_{j\to\infty} a_{j+1}/a_j < 1$ then $\sum_{j=0}^\infty a_j$
converges. If $\liminf_{j\to\infty} a_{j+1}/a_j > 1$ then $\sum_{j=0}^\infty a_j$ diverges to $+\infty$.


Proof In the first case, choose $r$ such that $\limsup_{j\to\infty} a_{j+1}/a_j < r < 1$.
Then there exists $j_0$ such that $a_{j+1}/a_j < r$ for $j \ge j_0$. Thus if $j > j_0$ then

$$a_j = \left(\frac{a_j}{a_{j-1}}\right)\left(\frac{a_{j-1}}{a_{j-2}}\right) \cdots \left(\frac{a_{j_0+1}}{a_{j_0}}\right)a_{j_0} < r^{j-j_0}a_{j_0} = (a_{j_0}r^{-j_0})r^j,$$

and so, taking the terms $a_0, \ldots, a_{j_0}$ into account, there exists $M$ such that
$a_j \le Mr^j$ for all $j$. By the comparison test, $\sum_{j=0}^\infty a_j$ converges.

In the second case, there exists $j_1$ such that $a_{j+1} > a_j$ for $j \ge j_1$, so that
$a_j > a_{j_1}$ for $j > j_1$. Thus $(a_j)_{j=0}^\infty$ is not a null sequence, so that by Proposition
4.1.4, $\sum_{j=0}^\infty a_j$ diverges to $+\infty$.


It is important to note that neither corollary gives any information when
$\limsup_{j\to\infty} a_j^{1/j} = 1$ or when $\limsup_{j\to\infty} a_{j+1}/a_j = 1$. When $a_j = 1/j$,
the sum diverges, and when $a_j = 1/j^2$, the sum converges. In either case,
$a_j^{1/j} \to 1$ and $a_{j+1}/a_j \to 1$ as $j \to \infty$.
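
The inconclusive case can be seen numerically for the two sequences just mentioned. The following is a minimal sketch in floating-point arithmetic; the helper names are ours, and the values are only sampled at a few indices.

```python
def root_and_ratio(a, j):
    """Return (a(j)**(1/j), a(j+1)/a(j)) for a positive sequence a."""
    return a(j) ** (1.0 / j), a(j + 1) / a(j)

def harmonic(j):
    """a_j = 1/j: the series diverges."""
    return 1.0 / j

def inverse_square(j):
    """a_j = 1/j^2: the series converges."""
    return 1.0 / j ** 2

for j in (10, 1000, 100000):
    # Both the j-th root and the ratio approach 1 for both sequences,
    # so neither test can distinguish them.
    print(j, root_and_ratio(harmonic, j), root_and_ratio(inverse_square, j))
```

Both quantities approach 1 from below in each case, although one series diverges and the other converges.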

We use D'Alembert's ratio test to introduce the exponential function,
one of the most important functions in analysis. Suppose that $x > 0$. Let
$a_j = x^j/j!$. Then $a_{j+1}/a_j = x/(j + 1)$ and $x/(j + 1) \to 0$ as $j \to \infty$,
so that $\sum_{j=0}^\infty x^j/j!$ converges, to $\exp(x)$, say. The mapping $x \to \exp(x)$ is
the exponential function. We set $e = \exp(1) = \sum_{j=0}^\infty 1/j!$. Note that since
$1/n! \le 1/2^{n-1}$ for $n \ge 1$, it follows that

$$2 < e < 1 + \sum_{n=1}^\infty \frac{1}{2^{n-1}} = 3.$$

In fact, $e = 2.718281828\ldots$. We shall extend this definition for negative $x$ in
the next section and for complex $x$ in Section 4.7.
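
The partial sums of the exponential series are easy to compute exactly for rational $x$. The following is a minimal sketch; the helper name `exp_partial_sum` is ours, and the choice of 20 terms is arbitrary.

```python
from fractions import Fraction

def exp_partial_sum(x, n):
    """Return sum_{j=0}^{n} x**j / j! exactly, for a rational x."""
    total, term = Fraction(0), Fraction(1)
    for j in range(n + 1):
        total += term
        term = term * x / (j + 1)   # uses the ratio a_{j+1}/a_j = x/(j+1)
    return total

e_20 = exp_partial_sum(Fraction(1), 20)
print(float(e_20))
```

Each new term is obtained from the previous one by multiplying by $x/(j + 1)$, the same ratio that appears in the application of D'Alembert's test.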

Let us give an example, relating to the argument of Theorem 3.3.1. Suppose
that $x$ is a positive real number. As in Theorem 3.3.1, we set $x_n = \lfloor 2^nx \rfloor/2^n$.
Let $a_0 = x_0 = \lfloor x \rfloor$ and let $a_n = 2^n(x_n - x_{n-1})$ for $n \in \mathbf{N}$. Then $a_n = 0$ or $1$,
and

$$x_n = \sum_{j=0}^n \frac{a_j}{2^j} \to x \text{ as } n \to \infty.$$

Thus $x = \sum_{n=0}^\infty (a_n/2^n)$. We can write this as

$$x = a_0 \cdot a_1a_2a_3\ldots.$$

This is the binary expansion of $x$. Note that, with this procedure, recurrent
1s are avoided.

We can of course also consider expansions with bases other than 2. We can
for example write $u = u_0 + \sum_{j=1}^\infty u_j/10^j$ where $0 \le u_j \le 9$, to obtain the
familiar decimal expansion of $u$, and we can write $v = v_0 + \sum_{j=1}^\infty v_j/3^j$, where
$v_j = 0, 1$ or $2$; this is the ternary expansion of $v$. There are other possibilities:
for example, we can write $w = w_0 + \sum_{j=2}^\infty w_j/j!$, where $0 \le w_j < j$.
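
The binary digits defined above can be computed directly from the formula $a_n = 2^n(x_n - x_{n-1})$. The following is a minimal sketch in exact rational arithmetic; the helper name `binary_digits` and the sample values $5/3$ and $1/2$ are our choices.

```python
from fractions import Fraction
from math import floor

def binary_digits(x, n):
    """Return [a_0, a_1, ..., a_n], where x_k = floor(2^k x) / 2^k,
    a_0 = x_0 = floor(x) and a_k = 2^k (x_k - x_{k-1}) for k >= 1."""
    xs = [Fraction(floor(2 ** k * x), 2 ** k) for k in range(n + 1)]
    return [int(xs[0])] + [int(2 ** k * (xs[k] - xs[k - 1])) for k in range(1, n + 1)]

x = Fraction(5, 3)
digits = binary_digits(x, 12)
# Rebuild the truncated expansion a_0 + a_1/2 + ... + a_12/2^12.
approx = sum(Fraction(a, 2 ** k) for k, a in enumerate(digits))

print(digits)
print(float(x - approx))
print(binary_digits(Fraction(1, 2), 5))   # terminating form, no recurrent 1s
```

For $x = 1/2$ the procedure returns the terminating expansion rather than the one ending in recurrent 1s.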

We can use these ideas to show that R is uncountable. 


Theorem 4.2.8 The set R of real numbers is uncountable. 


Proof We give two proofs. The first was given by Cantor in 1891. It is
enough to show that $[0, 1) = \{x \in \mathbf{R} : 0 \le x < 1\}$ is uncountable. Suppose
that $(x_n)_{n=1}^\infty$ is a sequence in $[0, 1)$. We show that there exists $y \in [0, 1)$ which
does not occur in the sequence, so that there can be no surjective mapping of
$\mathbf{N}$ onto $[0, 1)$. Let $x_n = 0.x_{n1}x_{n2}\ldots$ be the decimal expansion of $x_n$. We set
$y_n = 0$ if $x_{nn} \ne 0$, and $y_n = 2$ if $x_{nn} = 0$. The sum $\sum_{n=1}^\infty y_n/10^n$ converges,
to $y$, say. From the construction, the decimal expansion of $y$ has no recurrent
9s and differs from that of $x_n$ in the $n$th place, and so $y \ne x_n$ for any $n$.

For the second proof, we define an injective map $c$ from $P(\mathbf{N})$ into $[0, 1]$;
since $P(\mathbf{N})$ is uncountable, so is $[0, 1]$. This time, let us use ternary
expansions. Suppose that $A \subseteq \mathbf{N}$. Let $a_j = 2$ if $j \in A$, and let $a_j = 0$ if $j \notin A$.
Then $\sum_{j=1}^\infty a_j/3^j$ converges, to $c(A)$, say. Suppose that $A \ne B$, and that
$k$ is the least integer in exactly one of $A$ and $B$. Then $|c(A) - c(B)| \ge
2/3^k - \sum_{j=k+1}^\infty 2/3^j = 1/3^k$, and so $c(A) \ne c(B)$. Thus the mapping
$c : A \to c(A) : P(\mathbf{N}) \to [0, 1]$ is injective. We shall meet this function
again later.
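
The separation estimate in the second proof can be tested on finite sets. The following is a minimal sketch in exact rational arithmetic, restricted (as an assumption made for the sake of computation) to subsets of $\{1, \ldots, 6\}$; the function `c` below is the map of the proof applied to finite sets only.

```python
from fractions import Fraction
from itertools import combinations

def c(A):
    """c(A) = sum over j in A of 2 / 3^j, for a finite set A of positive integers."""
    return sum((Fraction(2, 3 ** j) for j in A), Fraction(0))

universe = range(1, 7)
# All 64 subsets of {1, ..., 6}.
subsets = [frozenset(s) for r in range(7) for s in combinations(universe, r)]
values = {A: c(A) for A in subsets}

# Distinct subsets receive distinct values, so c is injective on this family.
print(len(subsets), len(set(values.values())))
```

For any two distinct subsets $A$ and $B$ in this family, with $k$ the least element of their symmetric difference, the values satisfy $|c(A) - c(B)| \ge 1/3^k$, in agreement with the estimate in the proof.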


Cantor’s result, first proved by him in 1873, was very controversial. We 
know that the rationals are countable, and so there are 'many more'
irrationals than rationals. We can say more. A real number $x$ is algebraic if there
exists a non-zero polynomial $p$ with rational coefficients such that $x$ is a root
of $p$; otherwise it is transcendental. For example, radicals (numbers of the
form $k^{1/n}$) are algebraic. So are the three real roots of the quintic $x^5 - 4x + 2$,
although, following the results of Ruffini and Abel, these roots cannot be
expressed in terms of radicals. It can be hard to decide whether a particular
number is algebraic or transcendental, and it was only in 1844 that Liouville
first showed that any transcendental number existed. In 1851 he gave the first
explicit example, showing that the number $\sum_{n=1}^\infty 1/10^{n!}$ is transcendental. It
is easy to see that $e = \sum_{n=0}^\infty 1/n!$ is not rational. If $e = p/q$, then $q!e$ must
be an integer; but

$$q!e = \left(q! + q! + \frac{q!}{2!} + \cdots + \frac{q!}{q!}\right) + \left(\frac{1}{q + 1} + \frac{1}{(q + 1)(q + 2)} + \cdots\right).$$

The first term is an integer, and the second is positive and less than
$\sum_{k=1}^\infty 1/(q + 1)^k = 1/q \le 1$, giving a contradiction.
It is much harder to determine whether $e$ is algebraic or transcendental,
and it was only in 1873 (the same year as the first proof of Cantor's theorem)
that Hermite showed that $e$ is transcendental, whereas the transcendence of
$\pi$ was only established by Lindemann nine years later, in 1882. But the set
of algebraic numbers is countable (Exercise 4.2.15), and so there are 'many
more' transcendental numbers than algebraic ones! One valid objection to
this argument is that it is non-constructive; it does not give a method for
producing transcendental numbers. It is however the case that many important
results of analysis have this non-constructive property.


Exercises 


4.2.1 Which of the following series converge, and which diverge? 


4.2.2 Suppose that $0 \le a_n < 1$ for $n \in \mathbf{N}$. Show that if $\sum_{n=1}^\infty a_n$
converges, then so do $\sum_{n=1}^\infty a_n^2$ and $\sum_{n=1}^\infty a_n/(1 - a_n)$. Are the converse
statements true?

4.2.3 Suppose that $0 < a < b$. Show that

$$1 + \frac{1 + a}{1 + b} + \frac{(1 + a)(1 + 2a)}{(1 + b)(1 + 2b)} + \cdots$$

converges.

4.2.4 Suppose that $(a_j)_{j=0}^\infty$ is a sequence of non-negative real numbers for
which $\sum_{j=0}^\infty a_j$ converges. Show that there is a sequence $(m_j)_{j=0}^\infty$ of
positive numbers such that $m_j \to \infty$ as $j \to \infty$ and $\sum_{j=0}^\infty m_ja_j$
converges.

4.2.5 Suppose that $(a_j)_{j=0}^\infty$ is a sequence of non-negative real numbers for
which $\sum_{j=0}^\infty a_j$ diverges to $+\infty$. Show that there is a null sequence
$(m_j)_{j=0}^\infty$ of positive numbers such that $\sum_{j=0}^\infty m_ja_j$ diverges to $+\infty$.

4.2.6 Suppose that $x > 0$. Use the binomial theorem to show that

$$e_n(x) = \sum_{j=0}^n \frac{x^j}{j!} \ge (1 + x/n)^n.$$

Recall (Exercise 3.1.9) that $((1 + x/m)^m)_{m=1}^\infty$ is an increasing bounded
sequence, which tends to a limit. Show that

$$\lim_{m\to\infty} (1 + x/m)^m \ge e_n(x).$$

Show that $(1 + x/m)^m \to \exp(x)$ as $m \to \infty$.

4.2.7 Suppose that $(a_j)_{j=1}^\infty$ is a decreasing sequence of non-negative real
numbers. Show that $\sum_{j=1}^\infty a_j$ converges if and only if $\sum_{k=0}^\infty 3^ka_{3^k}$
converges, and if and only if $\sum_{k=1}^\infty ka_{k^2}$ converges.

4.2.8 Suppose that $(a_j)_{j=1}^\infty$ is a decreasing sequence of non-negative real
numbers for which $\sum_{j=1}^\infty a_j$ converges. Show that $na_n \to 0$ as $n \to \infty$.

4.2.9 Simplify $1 - a/(1 + a)$. Suppose that $a_j > 0$ for $j \in \mathbf{N}$. Show that
$\sum_{j=1}^\infty a_j/(1 + a_1)(1 + a_2)\ldots(1 + a_j)$ converges, to $s$ say, where $0 <
s \le 1$. Determine $s$ when $\sum_{j=1}^\infty a_j = +\infty$.

4.2.10 Suppose that $(a_j)_{j=0}^\infty$ and $(b_j)_{j=0}^\infty$ are sequences of positive real
numbers, and that there exists $j_0$ such that $a_{j+1}/a_j \le b_{j+1}/b_j$ for $j \ge j_0$.
Show that if $\sum_{j=0}^\infty b_j$ converges, then so does $\sum_{j=0}^\infty a_j$.

4.2.11 Suppose that $(a_j)_{j=1}^\infty$ is a sequence of non-negative real numbers for
which $\sum_{j=1}^\infty a_j/j$ converges. Show that $(\sum_{j=1}^n a_j)/n \to 0$ as $n \to \infty$
(Kronecker's Lemma).

4.2.12 The following tests, due to Kummer and Dini, extend D'Alembert's
ratio test. Suppose that $(a_j)_{j=0}^\infty$ and $(c_j)_{j=0}^\infty$ are sequences of positive
real numbers.

Show that if

$$\limsup_{j\to\infty} \left(\frac{c_{j+1}a_{j+1}}{a_j} - c_j\right) < 0$$

then $\sum_{j=0}^\infty a_j$ converges.
Show that if $\sum_{j=0}^\infty (1/c_j)$ diverges to $+\infty$ and

$$\liminf_{j\to\infty} \left(\frac{c_{j+1}a_{j+1}}{a_j} - c_j\right) > 0$$

then $\sum_{j=0}^\infty a_j$ diverges to $+\infty$.

4.2.13 As a special case of the tests of the previous exercise, suppose that
$(a_j)_{j=0}^\infty$ is a sequence of positive numbers for which $a_{j+1}/a_j \to 1$, so
that D'Alembert's test gives no information.

Show that if

$$\limsup_{j\to\infty} \left(\frac{j(a_{j+1} - a_j)}{a_j} + 1\right) < 0$$

then $\sum_{j=0}^\infty a_j$ converges.
Show that if

$$\liminf_{j\to\infty} \left(\frac{j(a_{j+1} - a_j)}{a_j} + 1\right) > 0$$

then $\sum_{j=0}^\infty a_j$ diverges to $+\infty$.

4.2.14 Suppose that $0 < a < b$. Show that

$$\frac{1 + a}{1 + b} + \frac{(1 + a)(2 + a)}{(1 + b)(2 + b)} + \cdots$$

converges if $b > a + 1$ and diverges if $b < a + 1$. What happens if
$b = a + 1$?

4.2.15 Show that the set of polynomials of degree $d$ with rational
coefficients is countable. Show that the set of all polynomials with rational
coefficients is countable. Show that the set of algebraic numbers is
countable.


4.3 Absolute and conditional convergence 
A series $\sum_{j=0}^\infty z_j$ is said to converge absolutely if $\sum_{j=0}^\infty |z_j|$ converges.


Proposition 4.3.1 If $\sum_{j=0}^\infty z_j$ converges absolutely then it converges, and
$|\sum_{j=0}^\infty z_j| \le \sum_{j=0}^\infty |z_j|$.

Proof If $z_j = x_j + iy_j$ then $|x_j| \le |z_j|$ and $|y_j| \le |z_j|$, so that $\sum_{j=0}^\infty z_j$
converges absolutely if and only if $\sum_{j=0}^\infty x_j$ and $\sum_{j=0}^\infty y_j$ do; it is enough
to consider series with real terms. Suppose that $\sum_{j=0}^\infty a_j$ is an absolutely
convergent real series. Let

$a_j^+ = a_j$ if $a_j \ge 0$ and $a_j^+ = 0$ if $a_j < 0$,

$a_j^- = 0$ if $a_j \ge 0$ and $a_j^- = -a_j = |a_j|$ if $a_j < 0$.

Since $\sum_{j=0}^\infty |a_j|$ converges, each of the series $\sum_{j=0}^\infty a_j^+$ and $\sum_{j=0}^\infty a_j^-$ converges.
Since $a_j = a_j^+ - a_j^-$, $\sum_{j=0}^\infty a_j$ converges. Since $|\sum_{j=0}^n a_j| \le \sum_{j=0}^n |a_j|$ for all
$n$, $|\sum_{j=0}^\infty a_j| \le \sum_{j=0}^\infty |a_j|$.

Absolutely convergent series are generally as well behaved as series
with non-negative terms. The comparison test, D'Alembert's ratio test and
Cauchy's test can clearly be used to test for absolute convergence. For
example, if $z \in \mathbf{C}$ then $\sum_{j=0}^\infty z^j/j!$ converges absolutely; we again denote the sum
by $\exp(z)$. Thus we have defined the exponential function for all complex $z$.
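
The series defining $\exp(z)$ can be summed numerically for a complex argument. The following is a minimal sketch in floating-point arithmetic; the helper name `exp_series`, the sample point $1 + 2i$ and the cut-off of 60 terms are our choices. The comparison with Python's `cmath.exp` is included only as a sanity check and is not part of the text's development.

```python
import cmath

def exp_series(z, n_terms=60):
    """Partial sum of sum_{j>=0} z**j / j! with n_terms terms."""
    total, term = 0j, 1 + 0j
    for j in range(n_terms):
        total += term
        term = term * z / (j + 1)   # next term z^(j+1)/(j+1)!
    return total

z = 1 + 2j
approx = exp_series(z)
print(approx, cmath.exp(z))
```

Because the series converges absolutely, the moduli of the terms are dominated by the terms of the real series for $\exp(|z|)$, which is why a modest number of terms suffices here.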

A series $\sum_{j=0}^\infty a_j$ is said to be conditionally convergent if it converges, but
does not converge absolutely.

Proposition 4.3.2 If $\sum_{j=0}^\infty a_j$ is a conditionally convergent real series
then $\sum_{j=0}^\infty a_j^+ = +\infty$ and $\sum_{j=0}^\infty a_j^- = +\infty$.


Proof At least one of the sums must diverge. Suppose that $\sum_{j=0}^\infty a_j^+ = +\infty$
and that $\sum_{j=0}^\infty a_j^-$ converges to $s_-$. Suppose that $M > 0$. There exists $n_0$
such that $\sum_{j=0}^n a_j^+ > M + s_-$ for $n \ge n_0$, so that

$$s_n = \sum_{j=0}^n a_j^+ - \sum_{j=0}^n a_j^- \ge \sum_{j=0}^n a_j^+ - s_- > M \text{ for } n \ge n_0.$$

Thus $(s_n)_{n=0}^\infty$ is unbounded, giving a contradiction. A similar argument
applies if $\sum_{j=0}^\infty a_j^- = +\infty$ and $\sum_{j=0}^\infty a_j^+$ converges.


Thus conditional convergence depends on cancellation of positive and
negative quantities, and arguments are generally more delicate. Fortunately
there are some useful tests for convergence; the conditions that are imposed
are all-important.


Theorem 4.3.3 (The alternating series test) Suppose that $(a_j)_{j=0}^\infty$ is a
decreasing null sequence of positive real numbers. Then $\sum_{j=0}^\infty (-1)^j a_j$
converges, to $s$, say. Further, if $s_n = \sum_{j=0}^n (-1)^j a_j$, the sequence
$(s_{2n+1})_{n=0}^\infty$ increases to $s$, and the sequence $(s_{2n})_{n=0}^\infty$ decreases to $s$.

Proof Since

$$s_{2n+1} = s_{2n-1} + (a_{2n} - a_{2n+1}) \ge s_{2n-1} \quad\text{and}\quad s_{2n+2} = s_{2n} - (a_{2n+1} - a_{2n+2}) \le s_{2n},$$

the sequence $(s_{2n+1})_{n=0}^\infty$ is increasing and the sequence $(s_{2n})_{n=0}^\infty$ is decreasing.
Since

$$s_{2n+1} = s_{2n} - a_{2n+1} \le s_{2n} \le s_0 \quad\text{and}\quad s_{2n+2} = s_{2n+1} + a_{2n+2} \ge s_{2n+1} \ge s_1,$$

the sequence $(s_{2n+1})_{n=0}^\infty$ is bounded above and the sequence $(s_{2n})_{n=0}^\infty$ is
bounded below. Consequently, they both converge, as $n \to \infty$. Since
$s_{2n} - s_{2n+1} = a_{2n+1} \to 0$ as $n \to \infty$, the limits are the same, and $s_n$ converges
to the common limit.


This result has the benefit that if we calculate $s_{2n}$ and $s_{2n+1}$ then we know
that $s_{2n+1} \le s \le s_{2n}$, and so we have an estimate of the error. But in practice
this estimate is usually too crude to be useful.
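
The bracketing of $s$ by consecutive partial sums can be seen numerically. The following Python sketch is an illustration only: the choice $a_j = 1/(j+1)$, whose alternating sum is $\log 2$, and the helper name `alternating_partial_sums` are assumptions made for the example.

```python
import math

def alternating_partial_sums(a, n_terms):
    """Return [s_0, ..., s_{n_terms-1}], where s_n = sum_{j<=n} (-1)^j a(j)."""
    sums, total = [], 0.0
    for j in range(n_terms):
        total += (-1) ** j * a(j)
        sums.append(total)
    return sums

# Illustrative choice: a_j = 1/(j+1); the alternating sum is log 2.
s = alternating_partial_sums(lambda j: 1.0 / (j + 1), 1002)
n = 500
lower, upper = s[2 * n + 1], s[2 * n]   # s_{2n+1} <= s <= s_{2n}
print(lower, math.log(2), upper)
```

Here the bracket has width $a_{2n+1} = 1/1002$, which shows how slowly the estimate improves for this series.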

The next three tests extend this result, and also apply to complex series.

Theorem 4.3.4 (Hardy's test) Suppose that $(a_j)_{j=0}^\infty$ is a null sequence
of complex numbers for which $\sum_{j=1}^\infty |a_j - a_{j-1}| < \infty$, and that $(z_j)_{j=0}^\infty$
is a sequence of complex numbers for which the sequence of partial sums
$(\sum_{j=0}^n z_j)_{n=0}^\infty$ is bounded. Then $\sum_{j=0}^\infty a_j z_j$ converges.


Proof This is a result whose proof is almost forced upon us. Since we do not 
know what the sum should be, we use the general principle of convergence. 
Thus we consider a sum of the form

$$s_n - s_m = a_{m+1}z_{m+1} + \cdots + a_n z_n.$$

Let $t_n = \sum_{j=0}^n z_j$ and let $s_n = \sum_{j=0}^n a_j z_j$ for $n \in \mathbf{Z}^+$. We do not have
information about the terms $z_j$, but we do know that there exists $M$ such
that $|t_n| \le M$ for all $n \in \mathbf{Z}^+$. Now $z_j = t_j - t_{j-1}$. We therefore substitute,
and rearrange:

$$\begin{aligned}
s_n - s_m &= a_{m+1}(t_{m+1} - t_m) + \cdots + a_n(t_n - t_{n-1}) \\
&= -a_{m+1}t_m + (a_{m+1} - a_{m+2})t_{m+1} + \cdots + (a_{n-1} - a_n)t_{n-1} + a_n t_n.
\end{aligned}$$

This equation (and others of a similar form) is known as Abel's formula.

Suppose that $\epsilon > 0$. There exists $n_0$ such that
$\sum_{j=n_0+1}^\infty |a_j - a_{j-1}| < \epsilon/3(M+1)$ and $|a_n| < \epsilon/3(M+1)$, for $n \ge n_0$.
If $n > m \ge n_0$ then

$$\begin{aligned}
|s_n - s_m| &\le |a_{m+1}|\,|t_m| + \Bigl(\sum_{j=m+1}^{n-1} |a_j - a_{j+1}|\,|t_j|\Bigr) + |a_n|\,|t_n| \\
&\le \Bigl(|a_{m+1}| + \sum_{j=m+1}^{n-1} |a_j - a_{j+1}| + |a_n|\Bigr) M < \epsilon.
\end{aligned}$$


Convergence therefore follows from the general principle of convergence. 
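
Abel's formula is a purely algebraic identity, so it can be checked on any finite data. The sketch below does this in Python; the particular sequences (`a` with $a_k = 1/(k+1)$, `z` with $z_k = e^{0.7ik}$) and the function name `abel_sides` are hypothetical choices made for the illustration.

```python
import cmath

def abel_sides(a, z, m, n):
    """Return both sides of Abel's formula for s_n - s_m, with 0 <= m < n < len(a)."""
    t, running = [], 0j
    for zk in z:                      # t[k] = z_0 + ... + z_k
        running += zk
        t.append(running)
    lhs = sum(a[k] * z[k] for k in range(m + 1, n + 1))
    rhs = (-a[m + 1] * t[m]
           + sum((a[k] - a[k + 1]) * t[k] for k in range(m + 1, n))
           + a[n] * t[n])
    return lhs, rhs

a = [1.0 / (k + 1) for k in range(50)]
z = [cmath.exp(0.7j * k) for k in range(50)]
lhs, rhs = abel_sides(a, z, 5, 40)
print(abs(lhs - rhs))                 # agreement up to rounding error
```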


Theorem 4.3.5 (Dirichlet's test) Suppose that $(a_j)_{j=0}^\infty$ is a decreasing null
sequence of positive real numbers and that $(z_j)_{j=0}^\infty$ is a sequence of complex
numbers for which the sequence of partial sums $(\sum_{j=0}^n z_j)_{n=0}^\infty$ is bounded.
Then $\sum_{j=0}^\infty a_j z_j$ converges, to $s$ say. Further, if $s_m = \sum_{j=0}^m a_j z_j$ and
$M = \sup_n |t_n|$ then $|s - s_m| \le 2a_{m+1}M$.

Proof Since $\sum_{j=1}^\infty |a_j - a_{j-1}| = \sum_{j=1}^\infty (a_{j-1} - a_j) = a_0$, the first
statement follows from Hardy's test. Let $t_n = \sum_{j=0}^n z_j$. Using Abel's formula, we
see that

$$\begin{aligned}
|s_n - s_m| &\le |a_{m+1}t_m| + |(a_{m+1} - a_{m+2})t_{m+1}| + \cdots + |(a_{n-1} - a_n)t_{n-1}| + |a_n t_n| \\
&\le \bigl(a_{m+1} + (a_{m+1} - a_{m+2}) + \cdots + (a_{n-1} - a_n) + a_n\bigr) M = 2a_{m+1}M.
\end{aligned}$$

Thus $|s - s_m| = \lim_{n\to\infty} |s_n - s_m| \le 2a_{m+1}M$.
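
The bound $|s_n - s_m| \le 2a_{m+1}M$ obtained in the proof can be tested on a concrete case. The following Python sketch assumes $a_j = 1/(j+1)$ and $z_j = e^{ij}$ (a point of the unit circle other than 1); it takes $M$ to be the largest $|t_k|$ over the finite range computed, which is all that the proof uses for indices in that range. The names are illustrative.

```python
import cmath

def partial_sums(terms):
    """Return the list of partial sums of a finite list of complex terms."""
    out, running = [], 0j
    for term in terms:
        running += term
        out.append(running)
    return out

N = 300
a = [1.0 / (k + 1) for k in range(N + 1)]        # decreasing null sequence
z = [cmath.exp(1j * k) for k in range(N + 1)]     # bounded partial sums
t = partial_sums(z)
s = partial_sums([a[k] * z[k] for k in range(N + 1)])
M = max(abs(tk) for tk in t)

# Largest value of |s_n - s_m| / (2 a_{m+1} M) over 0 <= m < n <= N.
worst_ratio = max(abs(s[n] - s[m]) / (2 * a[m + 1] * M)
                  for m in range(N) for n in range(m + 1, N + 1))
print(M, worst_ratio)
```

The ratio never exceeds 1, and $M$ itself is at most $2/|1 - e^{i}|$, the bound that comes from summing the geometric progression.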


Theorem 4.3.6 (Abel's test) Suppose that $(a_j)_{j=0}^\infty$ is a decreasing
sequence of positive numbers and that $\sum_{j=0}^\infty z_j$ converges. Then
$\sum_{j=0}^\infty a_j z_j$ converges.

Proof We deduce this from Dirichlet's test. The sequence $(a_j)_{j=0}^\infty$ converges:
let $a$ be its limit. Since the sequence of partial sums $(\sum_{j=0}^n z_j)_{n=0}^\infty$ is bounded,
it follows from Dirichlet's test that $\sum_{j=0}^\infty (a_j - a)z_j$ converges. But
$\sum_{j=0}^\infty a z_j$ converges. Adding, we obtain the result.


Exercises 


4.3.1 Do the following series converge? 


y 


eS 


4.3.2 Prove Abel’s test directly, without appealing to Dirichlet’s test. 

4.3.3 Suppose that $(a_j)_{j=0}^\infty$ and $(z_j)_{j=0}^\infty$ satisfy the conditions of Abel's test,
and that $\sum_{j=0}^\infty a_j z_j = t$. Find an upper bound for $|\sum_{j=0}^n a_j z_j - t|$.

4.3.4 Suppose that $\sum_{j=0}^\infty z_j^2$ converges absolutely. Show that $\sum_{j=0}^\infty z_j/(j+1)$
converges absolutely.


4.4 Iterated limits and iterated sums 


A real-valued function $f$ on $\mathbf{N} \times \mathbf{N}$ or on $\mathbf{Z}^+ \times \mathbf{Z}^+$ is called a double sequence;
we frequently write $(f_{mn})_{m=1,n=1}^{\infty}$ for $f$, where $f_{mn} = f(m,n)$. Suppose that
$(f_{mn})_{m=1,n=1}^{\infty}$ is a double sequence. Suppose that $f_{mn} \to g_n$ as $m \to \infty$, for
each $n \in \mathbf{N}$, and that $g_n \to g$ as $n \to \infty$. Suppose also that $f_{mn} \to h_m$ as
$n \to \infty$, for each $m \in \mathbf{N}$, and that $h_m \to h$ as $m \to \infty$. Does it follow that
$g = h$?

Simple examples show that the answer is 'no'. For example, let $f_{mn} = 1$
if $m \le n$ and let $f_{mn} = 0$ if $m > n$. Then

$$\lim_{m\to\infty}\Bigl(\lim_{n\to\infty} f_{mn}\Bigr) = \lim_{m\to\infty} 1 = 1, \qquad
\lim_{n\to\infty}\Bigl(\lim_{m\to\infty} f_{mn}\Bigr) = \lim_{n\to\infty} 0 = 0.$$

Thus even when the iterated limits exist, the value can depend on the order
in which the limits are taken.

The same phenomenon occurs with sums. Let $f_{mn} = 1$ if $m = n$, let
$f_{mn} = -1$ if $m = n + 1$, and let $f_{mn} = 0$ otherwise. Then

$$\sum_{m=1}^\infty\Bigl(\sum_{n=1}^\infty f_{mn}\Bigr) = 1 + \sum_{m=2}^\infty 0 = 1, \qquad
\sum_{n=1}^\infty\Bigl(\sum_{m=1}^\infty f_{mn}\Bigr) = \sum_{n=1}^\infty 0 = 0.$$

These examples show that we cannot always interchange the order in which
we take limits. On the other hand, if certain conditions are satisfied, then the
same value is obtained, independent of the order in which the limits are taken.
In the exercises, examples of this are given.
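
The dependence on the order of summation can be reproduced exactly, since each row and each column of a double sequence of this kind has only finitely many non-zero entries. The Python sketch below takes the double sequence with $f_{mn} = 1$ if $m = n$, $f_{mn} = -1$ if $m = n+1$ and $f_{mn} = 0$ otherwise; the function names are illustrative.

```python
def f(m, n):
    """Double sequence: 1 on the diagonal, -1 immediately below it, 0 elsewhere."""
    if m == n:
        return 1
    if m == n + 1:
        return -1
    return 0

def row_sum(m):
    # For fixed m, f(m, n) = 0 once n > m, so a finite range gives the full sum.
    return sum(f(m, n) for n in range(1, m + 1))

def column_sum(n):
    # For fixed n, f(m, n) = 0 once m > n + 1.
    return sum(f(m, n) for m in range(1, n + 2))

rows_first = sum(row_sum(m) for m in range(1, 201))        # sum over m of (sum over n)
columns_first = sum(column_sum(n) for n in range(1, 201))  # sum over n of (sum over m)
print(rows_first, columns_first)
```

Every row sum after the first is 0 and every column sum is 0, so the two partial totals do not change as the range of the outer sum is increased.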


Exercises 


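4.4.1 Suppose that $\{a_{jk} : (j,k) \in \mathbf{Z}^+ \times \mathbf{Z}^+\}$ is a double sequence of
non-negative numbers. Show that the following are equivalent.
(a) $\sum_{k=0}^\infty a_{jk}$ converges for each $j \in \mathbf{Z}^+$, and $\sum_{j=0}^\infty (\sum_{k=0}^\infty a_{jk})$
converges.
(b) $\sum_{j=0}^\infty a_{jk}$ converges for each $k \in \mathbf{Z}^+$, and $\sum_{k=0}^\infty (\sum_{j=0}^\infty a_{jk})$
converges.
(c) The set $\{\sum_{j=0}^n (\sum_{k=0}^n a_{jk}) : n \in \mathbf{Z}^+\}$ is bounded.
Show that if these conditions are satisfied then

$$\sum_{j=0}^\infty \Bigl(\sum_{k=0}^\infty a_{jk}\Bigr) = \sum_{k=0}^\infty \Bigl(\sum_{j=0}^\infty a_{jk}\Bigr).$$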
4.4.2 Suppose that $\{a_{mn} : (m,n) \in \mathbf{Z}^+ \times \mathbf{Z}^+\}$ is a double sequence of
non-negative numbers. Find a sufficient condition, corresponding to the
condition in the previous example, for the two limits

$$\lim_{m\to\infty}\Bigl(\lim_{n\to\infty} a_{mn}\Bigr) \quad\text{and}\quad \lim_{n\to\infty}\Bigl(\lim_{m\to\infty} a_{mn}\Bigr)$$

to exist and be equal.

4.4.3 Suppose that $\{z_{jk} : (j,k) \in \mathbf{Z}^+ \times \mathbf{Z}^+\}$ is a double sequence of complex
numbers. Show that the following are equivalent.
(a) $\sum_{k=0}^\infty |z_{jk}|$ converges for each $j \in \mathbf{Z}^+$, and $\sum_{j=0}^\infty (\sum_{k=0}^\infty |z_{jk}|)$
converges.
(b) $\sum_{j=0}^\infty |z_{jk}|$ converges for each $k \in \mathbf{Z}^+$, and $\sum_{k=0}^\infty (\sum_{j=0}^\infty |z_{jk}|)$
converges.
(c) The set $\{\sum\{z_{jk} : (j,k) \in F\} : F \text{ a finite subset of } \mathbf{Z}^+ \times \mathbf{Z}^+\}$ is
bounded.
Show that if these conditions are satisfied then

$$\sum_{j=0}^\infty \Bigl(\sum_{k=0}^\infty z_{jk}\Bigr) = \sum_{k=0}^\infty \Bigl(\sum_{j=0}^\infty z_{jk}\Bigr).$$


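4.4.4 Let $a_{jk} = 1/(j^2 - k^2)$ for $(j,k) \in \mathbf{N} \times \mathbf{N}$ with $j \ne k$, and let $a_{jj} = 0$
for $j \in \mathbf{N}$. By writing

$$\frac{1}{j^2 - k^2} = \frac{1}{2j}\Bigl(\frac{1}{j-k} + \frac{1}{j+k}\Bigr) \quad\text{for } j \ne k,$$

show that $\sum_{k=1}^\infty a_{jk} = -3/4j^2$, and show that the series converges
absolutely. Deduce that $\sum_{j=1}^\infty (\sum_{k=1}^\infty a_{jk})$ converges. Is

$$\sum_{j=1}^\infty \Bigl(\sum_{k=1}^\infty a_{jk}\Bigr) = \sum_{k=1}^\infty \Bigl(\sum_{j=1}^\infty a_{jk}\Bigr)?$$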


4.5 Rearranging series 


What happens if we try to add the terms of an infinite series in a different 
order? 


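Theorem 4.5.1 Suppose that $\sum_{j=0}^\infty z_j$ converges absolutely, and that
$\sum_{j=0}^\infty z_j = s$. If $\sigma$ is a permutation of $\mathbf{Z}^+$ then $\sum_{j=0}^\infty z_{\sigma(j)}$ converges to $s$.

Proof By considering real and imaginary parts, it is enough to consider an
absolutely convergent real series $\sum_{j=0}^\infty a_j$. First consider the case where all
the terms are non-negative. If $n \in \mathbf{Z}^+$ and $k = \sup\{\sigma(j) : 0 \le j \le n\}$ then
$\sum_{j=0}^n a_{\sigma(j)} \le \sum_{j=0}^k a_j \le s$. Thus $\sum_{j=0}^\infty a_{\sigma(j)}$ converges, and
$\sum_{j=0}^\infty a_{\sigma(j)} \le s$. By the same token, applied to the permutation $\sigma^{-1}$,
$s = \sum_{j=0}^\infty a_j \le \sum_{j=0}^\infty a_{\sigma(j)}$.

In the general case, write $a_j = a_j^+ - a_j^-$. Then $a_{\sigma(j)} = a_{\sigma(j)}^+ - a_{\sigma(j)}^-$, so that
$\sum_{j=0}^\infty a_{\sigma(j)}$ converges to $\sum_{j=0}^\infty a_{\sigma(j)}^+ - \sum_{j=0}^\infty a_{\sigma(j)}^-$, and

$$\sum_{j=0}^\infty a_{\sigma(j)} = \sum_{j=0}^\infty a_{\sigma(j)}^+ - \sum_{j=0}^\infty a_{\sigma(j)}^-
= \sum_{j=0}^\infty a_j^+ - \sum_{j=0}^\infty a_j^- = \sum_{j=0}^\infty a_j.$$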


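When $\sum_{j=1}^\infty a_j$ converges conditionally, the situation is completely
different. Let us give an example. Let $a_j = (-1)^{j+1}/\sqrt{j}$, for $j \in \mathbf{N}$.
Then

$$1 - \frac{1}{\sqrt2} + \frac{1}{\sqrt3} - \frac{1}{\sqrt4} + \cdots + \frac{1}{\sqrt{2j-1}} - \frac{1}{\sqrt{2j}} + \cdots$$

converges, by the alternating series test. Let us rearrange the terms, taking
two positive terms and one negative one, and repeating, to give the series

$$1 + \frac{1}{\sqrt3} - \frac{1}{\sqrt2} + \frac{1}{\sqrt5} + \frac{1}{\sqrt7} - \frac{1}{\sqrt4} + \cdots
+ \frac{1}{\sqrt{4j-3}} + \frac{1}{\sqrt{4j-1}} - \frac{1}{\sqrt{2j}} + \cdots.$$

Now

$$\sqrt{j}\,\Bigl(\frac{1}{\sqrt{4j-3}} + \frac{1}{\sqrt{4j-1}} - \frac{1}{\sqrt{2j}}\Bigr) \to 1 - \frac{1}{\sqrt2} \quad\text{as } j \to \infty,$$

so that there exists $j_0$ such that

$$\frac{1}{\sqrt{4j-3}} + \frac{1}{\sqrt{4j-1}} - \frac{1}{\sqrt{2j}} > \frac{1}{4\sqrt{j}} \quad\text{for } j \ge j_0.$$

Thus the sum of the rearranged terms diverges to $+\infty$.

This sort of phenomenon is quite general. Let us illustrate this by giving
one result for real series, which also indicates that there are many other
possibilities.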
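
Before turning to the general result, the divergence of the rearranged series can be observed numerically. The Python sketch below sums the series $\sum_{j \ge 1} (-1)^{j+1}/\sqrt{j}$ in its original order and in the order "two positive terms, then one negative term"; the function names are illustrative. Each block of three rearranged terms exceeds $(1 - 1/\sqrt2)/\sqrt{j}$, because $1/\sqrt{4j-3} + 1/\sqrt{4j-1} > 1/\sqrt{j}$.

```python
import math

def block(j):
    """j-th block of the rearrangement: two positive terms and one negative term."""
    return (1 / math.sqrt(4 * j - 3) + 1 / math.sqrt(4 * j - 1)
            - 1 / math.sqrt(2 * j))

def rearranged_sum(n_blocks):
    """Sum of the first 3 * n_blocks terms of the rearranged series."""
    return sum(block(j) for j in range(1, n_blocks + 1))

def original_sum(n_terms):
    """Partial sum of the series in its original order."""
    return sum((-1) ** (j + 1) / math.sqrt(j) for j in range(1, n_terms + 1))

print(original_sum(30000))                              # stays bounded
print(rearranged_sum(10000), rearranged_sum(40000))     # keeps growing
```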


Theorem 4.5.2 Suppose that $\sum_{j=1}^\infty a_j$ is a conditionally convergent real
series, and that $m \le M$. Then there exists a rearrangement $\sum_{j=1}^\infty a_{\sigma(j)}$ such
that, setting $t_n = \sum_{j=1}^n a_{\sigma(j)}$, $\liminf_{n\to\infty} t_n = m$ and $\limsup_{n\to\infty} t_n = M$.

Proof We shall describe the idea of the proof, but omit the technical details.
Let

$$P = \{j_1 < j_2 < \cdots\} = \{j \in \mathbf{N} : a_j \ge 0\}, \qquad Q = \{k_1 < k_2 < \cdots\} = \mathbf{N} \setminus P.$$

Then $\sum_{i=1}^\infty a_{j_i} = \sum_{j=1}^\infty a_j^+ = +\infty$ and $\sum_{i=1}^\infty (-a_{k_i}) = \sum_{j=1}^\infty a_j^- = +\infty$.
Let us suppose that $M \ge 0$. Let $i_1$ be the least integer such that $\sum_{i=1}^{i_1} a_{j_i} > M$.
We define $\sigma(i) = j_i$ for $1 \le i \le i_1$. Next, let $l_1$ be the least integer such
that $\sum_{i=1}^{i_1} a_{j_i} + \sum_{i=1}^{l_1} a_{k_i} < m$. We define $\sigma(i_1 + j) = k_j$ for $1 \le j \le l_1$. We
now iterate this procedure, so that the partial sums oscillate between values
greater than $M$ and values less than $m$. The procedure does not terminate,
since the sums $\sum_{i=1}^\infty a_{j_i}$ and $\sum_{i=1}^\infty (-a_{k_i})$ are infinite. The resulting mapping
$\sigma$ from $\mathbf{N}$ to $\mathbf{N}$ is then clearly bijective. Finally, since the sum $\sum_{j=1}^\infty a_j$ is
convergent, the sequence $(a_j)$ is a null sequence. Thus the size of the
'overshoots' tends to 0, so that $\liminf_{n\to\infty} t_n = m$ and $\limsup_{n\to\infty} t_n = M$.

If $M < 0$, we start by finding a sum less than $m$, and then proceed as
above.

In particular, we can rearrange the series to converge to any limit whatever.

Corollary 4.5.3 If $l \in \mathbf{R}$, there exists a rearrangement $\sum_{j=1}^\infty a_{\sigma(j)}$ which
converges to $l$.

Proof Take $m = M = l$.
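
The idea of the proof can be tried out on a computer. The following Python sketch is a simplified greedy version of the procedure with $m = M$ equal to a target value, applied to the series $1 - 1/2 + 1/3 - 1/4 + \cdots$: it adds the next unused positive term while the running total is at most the target, and the next unused negative term otherwise. The function name `rearrange_to` and the choice of series are assumptions made for the illustration, and the sketch is not a substitute for the omitted technical details.

```python
def rearrange_to(target, n_terms):
    """Greedy rearrangement of 1 - 1/2 + 1/3 - ... aimed at `target`.

    Returns the partial sums t_1, ..., t_{n_terms} of the rearranged series.
    Each term of the original series is used at most once, in its own order
    within the positive terms and within the negative terms.
    """
    next_pos, next_neg = 1, 2          # odd denominators: +, even denominators: -
    total, partial = 0.0, []
    for _ in range(n_terms):
        if total <= target:
            total += 1.0 / next_pos
            next_pos += 2
        else:
            total -= 1.0 / next_neg
            next_neg += 2
        partial.append(total)
    return partial

t = rearrange_to(1.5, 100_000)
print(t[-1])                           # close to 1.5
```

After both kinds of term have been used, the running total stays within the size of the most recently used terms of the target, which is why the overshoots shrink.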


Exercises 
4.5.1 Let 
ee ae ere a oe 
De ee ale a ed 
Show that 
re oe ee ee 
D 5 a 
4.5.2 Show that 
eco areca meee ee ee ae 
Ties BP ag? ie Us ee ae ee 


4.5.3 Suppose that $\sum_{j=1}^\infty a_j$ is convergent to $s$, and that $\sigma$ is a permutation
of $\mathbf{N}$.
(a) Suppose that $|\sigma(j) - j| \le K$ for all $j$. Show that $\sum_{j=1}^\infty a_{\sigma(j)}$ is
convergent to $s$.
(b) Let $m_j = \sup\{|a_k| : k \ge j\}$. Suppose that $m_j|\sigma(j) - j| \to 0$ as
$j \to \infty$. Show that $\sum_{j=1}^\infty a_{\sigma(j)}$ is convergent to $s$.


4.6 Convolution, or Cauchy, products


The results in this section relate to power series, which we shall consider
in more detail in the next section. Suppose that $(a_j)_{j=0}^\infty$ and $(b_j)_{j=0}^\infty$ are
sequences of complex numbers. We consider two formal power series

$$a(x) = a_0 + a_1x + a_2x^2 + \cdots, \qquad b(x) = b_0 + b_1x + b_2x^2 + \cdots.$$

If we formally multiply them, and collect terms together, we obtain

$$a(x)b(x) = c(x) = c_0 + c_1x + c_2x^2 + \cdots, \quad\text{where } c_j = \sum_{i=0}^j a_i b_{j-i}.$$

The sequence $(c_j)_{j=0}^\infty$ is the convolution product, or Cauchy product, of the
sequences $(a_j)_{j=0}^\infty$ and $(b_j)_{j=0}^\infty$.
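
The definition $c_j = \sum_{i=0}^j a_i b_{j-i}$ translates directly into code. A minimal Python sketch, in which the function name `cauchy_product` is an illustrative choice and finite lists stand in for the initial segments of the two sequences:

```python
def cauchy_product(a, b):
    """Return c_0, ..., c_{n-1}, where n = min(len(a), len(b)) and
    c_j = a_0 b_j + a_1 b_{j-1} + ... + a_j b_0."""
    n = min(len(a), len(b))
    return [sum(a[i] * b[j - i] for i in range(j + 1)) for j in range(n)]

# (1 + x)(1 + x) = 1 + 2x + x^2, with the coefficient lists padded by a zero.
print(cauchy_product([1, 1, 0], [1, 1, 0]))
# (1 + x + x^2 + ...)^2 = 1 + 2x + 3x^2 + ..., truncated to four terms.
print(cauchy_product([1, 1, 1, 1], [1, 1, 1, 1]))
```

Only the first $n$ coefficients of each factor are needed to compute $c_0, \ldots, c_{n-1}$, which is why truncation is harmless here.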

Suppose that $\sum_{j=0}^\infty a_j$ converges to $s$ and that $\sum_{j=0}^\infty b_j$ converges to $t$.
What can we say about the convergence of $\sum_{j=0}^\infty c_j$? First, if both converge
conditionally then $\sum_{j=0}^\infty c_j$ need not converge. For example, if
$a_j = b_j = (-1)^j/\sqrt{j+1}$, then $\sum_{j=0}^\infty a_j$ and $\sum_{j=0}^\infty b_j$ converge, by the alternating series
test. But

$$c_j = (-1)^j \sum_{k=1}^{j+1} \frac{1}{\sqrt{k(j+2-k)}}.$$

Since $k(j+2-k) \le (j+2)^2/4$, it follows that $|c_j| \ge 2(j+1)/(j+2) \ge 1$,
and the series $\sum_{j=0}^\infty c_j$ does not converge.
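
The lower bound on $|c_j|$ can be checked directly. The Python sketch below computes the Cauchy product terms for $a_j = b_j = (-1)^j/\sqrt{j+1}$ from the definition; the function name `c` mirrors the notation of the text and is otherwise an illustrative choice.

```python
import math

def c(j):
    """j-th Cauchy product term for a_j = b_j = (-1)^j / sqrt(j + 1)."""
    return sum(((-1) ** i / math.sqrt(i + 1))
               * ((-1) ** (j - i) / math.sqrt(j - i + 1))
               for i in range(j + 1))

magnitudes = [abs(c(j)) for j in range(200)]
print(min(magnitudes))        # the terms do not tend to 0
```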
On the other hand, we have the following. 


Proposition 4.6.1 If $\sum_{j=0}^\infty a_j$ and $\sum_{j=0}^\infty b_j$ are absolutely convergent,
to $s$ and $t$ respectively, and $c_j = \sum_{i=0}^j a_i b_{j-i}$ then $\sum_{j=0}^\infty c_j$ is absolutely
convergent to $st$.


Proof First suppose that $a_j \ge 0$ and $b_j \ge 0$ for all $j$. Consider the terms
$a_jb_k$ arranged in a semi-infinite array:

$$\begin{array}{cccc}
a_0b_0 & a_0b_1 & a_0b_2 & \cdots \\
a_1b_0 & a_1b_1 & a_1b_2 & \cdots \\
a_2b_0 & a_2b_1 & a_2b_2 & \cdots \\
\vdots & \vdots & \vdots &
\end{array}$$

Then $c_j$ is the sum of the terms on the diagonal line $\{(i,k) : i + k = j\}$.
Thus $u_n = \sum_{j=0}^n c_j$ is the sum of the terms in the triangle on and above the
line $\{(i,k) : i + k = n\}$. Thus if $m = [n/2]$ is the integral part of $n/2$ then

$$\Bigl(\sum_{i=0}^m a_i\Bigr)\Bigl(\sum_{k=0}^m b_k\Bigr) \le u_n \le \Bigl(\sum_{i=0}^n a_i\Bigr)\Bigl(\sum_{k=0}^n b_k\Bigr),$$

so that $u_n \to st$, by the sandwich principle.

The result now extends to the case where $\sum_{j=0}^\infty a_j$ and $\sum_{j=0}^\infty b_j$ are
absolutely convergent, by considering real and imaginary parts, and splitting these
into positive and negative parts.

Figure 4.6. Summing a convolution product.


Let us apply this to the exponential function. Let $a_j = a^j/j!$ and $b_j = b^j/j!$.
Then

$$c_j = \sum_{i=0}^j \frac{a^i b^{j-i}}{i!\,(j-i)!} = \frac{1}{j!}\sum_{i=0}^j \binom{j}{i} a^i b^{j-i} = \frac{(a+b)^j}{j!},$$

by the binomial theorem. Thus $\exp(a)\exp(b) = \exp(a+b)$. Consequently
$\exp(z)\exp(-z) = 1$, so that $e^z \ne 0$. In particular, if $x$ is real and negative
then $\exp(x) = 1/\exp(-x) > 0$. The mapping $z \to \exp(z)$ is a homomorphism
of the additive group $(\mathbf{C}, +)$ into the multiplicative group $(\mathbf{C} \setminus \{0\}, \times)$ of
non-zero complex numbers. For this reason, we frequently write $e^z$ for $\exp(z)$.
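
The coefficient identity behind $\exp(a)\exp(b) = \exp(a+b)$ can be verified exactly with rational arithmetic. In the Python sketch below the values $a = 2/3$ and $b = -5/4$ and the helper names are arbitrary choices made for the illustration.

```python
from fractions import Fraction
from math import factorial

def exp_terms(x, n):
    """Exact terms x^j / j! of the exponential series for j = 0, ..., n - 1."""
    return [Fraction(x) ** j / factorial(j) for j in range(n)]

def cauchy_product(a, b):
    """Convolution c_j = sum_{i<=j} a_i b_{j-i} of two lists of equal length."""
    return [sum(a[i] * b[j - i] for i in range(j + 1)) for j in range(len(a))]

a, b = Fraction(2, 3), Fraction(-5, 4)
lhs = cauchy_product(exp_terms(a, 12), exp_terms(b, 12))
rhs = exp_terms(a + b, 12)
print(lhs == rhs)
```

Using `Fraction` avoids rounding, so the comparison tests the binomial identity itself rather than a floating-point approximation to it.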

What happens when one series is absolutely convergent and the other is 
conditionally convergent? 


Theorem 4.6.2 If $\sum_{j=0}^\infty a_j$ is absolutely convergent to $s$ and $\sum_{j=0}^\infty b_j$
is conditionally convergent to $t$, and if $c_j = \sum_{i=0}^j a_i b_{j-i}$ then $\sum_{j=0}^\infty c_j$ is
convergent to $st$.

Proof Let $s_n$, $t_n$ and $u_n$ denote the $n$th partial sums of the three series.
The sequence $(t_n)_{n=0}^\infty$ is bounded. Let $M = \sup_n |t_n|$, and let $L = \sum_{j=0}^\infty |a_j|$.
Let $m = [n/2]$. Now

$$u_n = \sum_{i=0}^n a_i \Bigl(\sum_{j=0}^{n-i} b_j\Bigr) = a_0t_n + a_1t_{n-1} + \cdots + a_nt_0.$$

Here we first add the rows of the triangle $\{(i,j) : i + j \le n\}$, and then add
the resulting sums. Thus

$$u_n - s_nt = a_0(t_n - t) + \cdots + a_n(t_0 - t).$$

We split the sum into two parts: $u_n - s_nt = A_1 + A_2$, where

$$A_1 = a_0(t_n - t) + \cdots + a_m(t_{n-m} - t) \quad\text{and}\quad A_2 = a_{m+1}(t_{n-m-1} - t) + \cdots + a_n(t_0 - t).$$

We consider the two sums separately. Given $\epsilon > 0$, there exists $n_0$ such that

$$\sum_{j=n}^\infty |a_j| < \frac{\epsilon}{3(M+1)} \quad\text{and}\quad |t_n - t| < \frac{\epsilon}{3(L+1)} \quad\text{for } n \ge n_0.$$

If $n \ge 2n_0$, then $m \ge n_0$ and $n - j \ge n_0$ for $0 \le j \le m$, so that

$$|A_1| \le \frac{\epsilon}{3(L+1)} \Bigl(\sum_{j=0}^m |a_j|\Bigr) < \epsilon/3.$$

Further, since $|t_k - t| \le 2M$ for all $k$,

$$|A_2| \le 2M \Bigl(\sum_{j=m+1}^n |a_j|\Bigr) < 2\epsilon/3,$$

so that $u_n - s_nt \to 0$ as $n \to \infty$. Since $u_n - st = u_n - s_nt + (s_n - s)t$, it
follows that $u_n \to st$ as $n \to \infty$.
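
A numerical check of the theorem is easy to set up. The Python sketch below assumes $a_j = 2^{-j}$, which is absolutely convergent with sum 2, and $b_j = (-1)^j/(j+1)$, which is conditionally convergent with sum $\log 2$; these series are illustrative choices and do not come from the text.

```python
import math

N = 1000
a = [0.5 ** j for j in range(N + 1)]               # absolutely convergent, s = 2
b = [(-1) ** j / (j + 1) for j in range(N + 1)]    # conditionally convergent, t = log 2

# Cauchy product terms c_0, ..., c_N and the partial sum u_N.
c = [sum(a[i] * b[j - i] for i in range(j + 1)) for j in range(N + 1)]
u_N = sum(c)
target = 2 * math.log(2)
print(u_N, target)
```

The difference $u_N - st$ is of the order of $1/N$ here, reflecting the slow convergence of the conditionally convergent factor.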


The technique of this proof, where we divide a sum into two parts, and
consider each part separately, is one that is used in many areas of analysis.

Exercises


4.6.1 Let $a_j = b_j = (-1)^j/(j+1)$ and let $c_j = \sum_{i=0}^j a_i b_{j-i}$. Show that

$$c_j = \frac{2(-1)^j}{j+2}\Bigl(1 + \frac{1}{2} + \cdots + \frac{1}{j+1}\Bigr).$$

Show that $(|c_j|)_{j=0}^\infty$ is a decreasing null sequence. Deduce that $\sum_{j=0}^\infty c_j$
converges.

4.6.2 Suppose that $a_n \to a$ as $n \to \infty$. Let $s_n = a_0 + \cdots + a_n$. Show that
$s_n/(n+1) \to a$ as $n \to \infty$.
Suppose that $a_n \to a$ and that $b_n \to b$ as $n \to \infty$. Show that

$$\frac{a_0b_n + a_1b_{n-1} + \cdots + a_nb_0}{n+1} \to ab \quad\text{as } n \to \infty.$$

Suppose that $(c_j)_{j=0}^\infty$ is the convolution product of the sequences $(a_j)_{j=0}^\infty$
and $(b_j)_{j=0}^\infty$, and that $\sum_{j=0}^\infty a_j$ is conditionally convergent to $s$ and
$\sum_{j=0}^\infty b_j$ is conditionally convergent to $t$. Let $u_n = c_0 + \cdots + c_n$.
(a) Show that $u_0 + \cdots + u_n = s_0t_n + \cdots + s_nt_0$.
(b) Show that $(u_0 + \cdots + u_n)/(n+1) \to st$ as $n \to \infty$.
(c) Show that if $\sum_{j=0}^\infty c_j$ converges, then its sum must be $st$.

4.7 Power series 


A power series is an expression of the form $\sum_{n=0}^\infty a_n(z - z_0)^n$, where $(a_n)_{n=0}^\infty$
is a sequence of complex numbers, $z_0$ is a complex number, and $z$ is a complex
number, which we also allow to vary. (In fact, in many circumstances we shall
consider complex power series for which the coefficients $a_n$ are real.) We are
interested in the values of $z$ for which the power series converges. For this it
is clearly sufficient to consider the case where $z_0 = 0$.

We introduce some notation. If $0 < R < \infty$ we set $U_R = \{z \in \mathbf{C} : |z| < R\}$,
the open disc of radius $R$ with centre 0, and we set $U_\infty = \mathbf{C}$. Thus $U_1 = \mathbf{D}$,
the open unit disc.

Let us begin with a very simple example. Consider the power series
$\sum_{n=0}^\infty z^n$. If $|z| \ge 1$ then $|z^n|$ does not tend to zero, and so the power series
diverges. If $|z| < 1$, then

$$s_n = \sum_{j=0}^n z^j = \frac{1 - z^{n+1}}{1 - z}, \quad\text{so that}\quad \Bigl|s_n - \frac{1}{1-z}\Bigr| = \frac{|z|^{n+1}}{|1-z|} \to 0 \text{ as } n \to \infty,$$

and $\sum_{n=0}^\infty z^n$ converges to $1/(1-z)$. Thus $\sum_{n=0}^\infty z^n$ converges if and only
if $z$ is in the open unit disc $\mathbf{D} = \{z : |z| < 1\}$.

We can however say more. If $|z| < 1$ then

$$\sum_{n=0}^\infty |z^n| = \sum_{n=0}^\infty |z|^n = 1/(1 - |z|),$$

so that the series $\sum_{n=0}^\infty z^n$ converges absolutely.

Suppose that $\sum_{n=0}^\infty a_nz^n$ is a complex power series. For what values of $z$
does it converge? To answer this, it is convenient to consider the set

$$B = \{r \in [0,\infty) : (a_nr^n)_{n=0}^\infty \text{ is a bounded sequence}\}.$$

$0 \in B$, and if $r \in B$ then $[0,r] \subseteq B$. Thus $B$ is an interval. If $B$ is bounded
we set $R = \sup B$. $R$ may or may not belong to $B$. If $B = [0,\infty)$, we set
$R = \infty$. $R$ is called the radius of convergence of the power series $\sum_{n=0}^\infty a_nz^n$.
The next theorem explains the reason for this name.


Theorem 4.7.1 Suppose that $\sum_{n=0}^\infty a_nz^n$ is a complex power series with
radius of convergence $R$. If $z \in U_R$ then $\sum_{n=0}^\infty a_nz^n$ converges absolutely. If
$|z| > R$ then $\sum_{n=0}^\infty a_nz^n$ does not converge.

Proof If $|z| > R$ then $(a_nz^n)_{n=0}^\infty$ is unbounded, and so the power series
diverges. (In particular, if $R = 0$ then the series only converges when $z = 0$.)
Suppose that $|z| < R$. There exists $s$ such that $|z| < s < R$, and so
$M_s = \sup_{n \in \mathbf{Z}^+} |a_ns^n| < \infty$. Let $r = |z|/s$, so that $0 \le r < 1$. Then

$$|a_nz^n| = |a_ns^nr^n| \le M_sr^n \quad\text{for } n \in \mathbf{N}.$$

By the comparison test, the series $\sum_{n=0}^\infty |a_nz^n|$ converges, and so $\sum_{n=0}^\infty a_nz^n$
converges absolutely.


Note that the proof depends only on the convergence of a geometric series.
This simple idea is very powerful, and we shall use it, and the convergence
of series such as $\sum_{n=0}^\infty n^kr^n$, where $0 < r < 1$ and $k \in \mathbf{N}$, many times in the
future.

We have the following formula for the radius of convergence. 


Theorem 4.7.2 Suppose that $\sum_{n=0}^\infty a_nz^n$ is a power series with radius of
convergence $R$. Let $A = \limsup |a_n|^{1/n}$. If $A = 0$ then $R = \infty$. If $A = \infty$
then $R = 0$. Otherwise, $R = 1/A$.

Proof This is just a matter of teasing out the definitions. Suppose that
$A < \infty$ and that $S > A$. Then there exists $n_0$ such that $|a_n|^{1/n} < S$ for
$n \ge n_0$. Thus $|a_n|/S^n < 1$ for $n \ge n_0$; the sequence $(a_n/S^n)_{n=0}^\infty$ is bounded,
and so $1/S \le R$. Since this holds for all $S > A$, $R = \infty$ if $A = 0$, and
$R \ge 1/A$ otherwise. Suppose that $A > 0$ and $0 < s < A$. Let $s < t < A$. Then
$|a_n|^{1/n} > t$ for infinitely many $n$. Thus $|a_n|/s^n > (t/s)^n$ for infinitely many
$n$; the sequence $(a_n/s^n)_{n=0}^\infty$ is unbounded, and so $1/s \ge R$. Since this holds
for all $s < A$, $R = 0$ if $A = \infty$, and $R \le 1/A$ otherwise.
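
The formula suggests a rough numerical experiment. The Python sketch below approximates $\limsup |a_n|^{1/n}$ by the largest value of $\log|a_n|/n$ over a finite tail of indices; this is only a heuristic, since a limsup cannot be determined from finitely many terms, and the function name `radius_estimate` and the choice of tail are assumptions made for the illustration.

```python
import math

def radius_estimate(coeff, n_max):
    """Heuristic estimate of R = 1 / limsup |a_n|^(1/n).

    The limsup is approximated by the largest value of log|a_n| / n over the
    tail n_max // 2 <= n <= n_max; zero coefficients are skipped.
    """
    tail = [math.log(abs(coeff(n))) / n
            for n in range(max(1, n_max // 2), n_max + 1)
            if coeff(n) != 0]
    if not tail:
        return math.inf               # every coefficient in the tail vanishes
    return math.exp(-max(tail))

# a_n = 2^n for even n and 0 for odd n: limsup |a_n|^(1/n) = 2, although
# the sequence |a_n|^(1/n) itself does not converge.
print(radius_estimate(lambda n: 2 ** n if n % 2 == 0 else 0, 400))
```

Working with logarithms avoids overflow when the coefficients are large integers.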


The theorem says nothing about convergence on the circle of convergence
$C_R = \{z \in \mathbf{C} : |z| = R\}$. There are many possibilities, as the following
examples show.

1. $\sum_{n=0}^\infty n!z^n$. Since $(n!r^n)_{n=0}^\infty$ is unbounded for all $r > 0$, $R = 0$, and the
series only converges when $z = 0$.

2. $\sum_{n=0}^\infty nz^n$. Here $B = [0,1)$ and $R = 1$. The sequence $(nz^n)_{n=0}^\infty$ is
unbounded for each $z \in C_1$.

3. $\sum_{n=0}^\infty z^n$. Here $B = [0,1]$ and $R = 1$. $z^n \not\to 0$ as $n \to \infty$ for each $z \in C_1$.

4. $\sum_{n=1}^\infty z^n/n$. Here $B = [0,1]$ and $R = 1$. $\sum_{n=1}^\infty z^n/n$ diverges when $z = 1$. If
$z \in C_1$ and $z \ne 1$ then

$$\Bigl|\sum_{j=0}^n z^j\Bigr| = \Bigl|\frac{1 - z^{n+1}}{1 - z}\Bigr| \le \frac{2}{|1-z|},$$

so that the sequence $(\sum_{j=0}^n z^j)_{n=0}^\infty$ is bounded. Consequently, the series
$\sum_{n=1}^\infty z^n/n$ converges, by Dirichlet's test (Theorem 4.3.5).

5. $\sum_{n=1}^\infty z^n/n^2$. Here $B = [0,1]$ and $R = 1$. The series converges uniformly
on $\{z \in \mathbf{C} : |z| \le 1\}$.

6. $\sum_{n=0}^\infty z^n/n!$. Here $B = [0,\infty)$ and $R = \infty$. The function $e^z = \exp(z) =
\sum_{n=0}^\infty z^n/n!$ is the exponential function.
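
The behaviour of $\sum_{n \ge 1} z^n/n$ on the unit circle can be explored numerically. The Python sketch below checks that the partial sums of the geometric progression stay below $2/|1-z|$ at a point of the circle other than 1, and compares partial sums of $\sum z^n/n$ at $z = -1$ and $z = 1$; the sample points and the function names are illustrative choices.

```python
import cmath
import math

def max_geometric_partial_sum(z, n_max):
    """Largest |1 + z + ... + z^n| over 0 <= n <= n_max."""
    running, power, largest = 0j, 1 + 0j, 0.0
    for _ in range(n_max + 1):
        running += power
        largest = max(largest, abs(running))
        power *= z
    return largest

def log_series(z, n_terms):
    """Partial sum of sum_{n >= 1} z^n / n."""
    return sum(z ** n / n for n in range(1, n_terms + 1))

z = cmath.exp(2j)                                    # a point of the unit circle, z != 1
print(max_geometric_partial_sum(z, 1000), 2 / abs(1 - z))
print(log_series(-1, 10000), -math.log(2))           # converges at z = -1
print(log_series(1, 10000))                          # harmonic partial sum, unbounded growth
```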


If $\sum_{n=0}^\infty a_nz^n$ and $\sum_{n=0}^\infty b_nz^n$ are power series, we can form the sum
$\sum_{n=0}^\infty (a_n + b_n)z^n$.

Proposition 4.7.3 Suppose that $\sum_{n=0}^\infty a_nz^n$ has radius of convergence $R$
and $\sum_{n=0}^\infty b_nz^n$ has radius of convergence $R'$. If $R \ne R'$ then the radius of
convergence of $\sum_{n=0}^\infty (a_n + b_n)z^n$ is $\min(R, R')$; if $R = R'$ then the radius of
convergence is greater than or equal to $R$.

Proof The proof is left as an exercise.

If $\sum_{n=0}^\infty a_nz^n$ and $\sum_{n=0}^\infty b_nz^n$ are power series, we can form the formal
product $\sum_{n=0}^\infty c_nz^n$, where $c_n = \sum_{j=0}^n a_jb_{n-j}$, as in the previous section.


Theorem 4.7.4 If $\sum_{n=0}^\infty a_nz^n$ has radius of convergence $R$ and $\sum_{n=0}^\infty b_nz^n$
has radius of convergence $R'$ then the formal product $\sum_{n=0}^\infty c_nz^n$ has radius
of convergence greater than or equal to $\min(R, R')$. If $|z| < \min(R, R')$ then

$$\Bigl(\sum_{n=0}^\infty a_nz^n\Bigr)\Bigl(\sum_{n=0}^\infty b_nz^n\Bigr) = \sum_{n=0}^\infty c_nz^n.$$

Proof Let $R''$ be the radius of convergence of $\sum_{n=0}^\infty c_nz^n$. If $|z| < \min(R, R')$
then all three series converge absolutely, by Proposition 4.6.1, and

$$\Bigl(\sum_{n=0}^\infty a_nz^n\Bigr)\Bigl(\sum_{n=0}^\infty b_nz^n\Bigr) = \sum_{n=0}^\infty c_nz^n.$$

Hence $R'' \ge \min(R, R')$.


We shall consider power series further in Section 6.6, and in Volume III. 


Exercises 


4.7.1 Prove Proposition 4.7.3. 
4.7.2 Find the radii of convergence of the following power series: 


re) a) ' a) 


n=0 n=0 


oe gan 
DB 2"(n+1) 


n=0 


4.7.3 What is the radius of convergence of the power series 


At which points, if any, of the circle of convergence does it converge? 
4.7.4 Suppose that $a_{n+1}/a_n \to \lambda$ as $n \to \infty$. What is the radius of
convergence of $\sum_{n=0}^\infty a_nz^n$?
4.7.5 What are the radii of convergence of the power series 


Lege? 44 48 4 end l= ge 2? = a? 


What is the radius of convergence of their product? 
4.7.6 Suppose that the series $\sum_{n=0}^\infty a_nz^n$ has non-zero radius of convergence
$R$. Let $f(z) = \sum_{n=0}^\infty a_nz^n$ for $|z| < R$.

(a) Show that if the coefficients $a_n$ are real, then $f(\bar{z}) = \overline{f(z)}$ for
$|z| < R$.

(b) Show that $f$ is even (that is, $f(z) = f(-z)$ for $|z| < R$) if and
only if $a_n = 0$ for $n$ odd.

(c) Show that $f$ is odd (that is, $f(z) = -f(-z)$ for $|z| < R$) if and
only if $a_n = 0$ for $n$ even.

(d) Suppose that $f \ne 0$. Show that there exists $0 < r < R$ such that
$f(z) \ne 0$ for $0 < |z| < r$.

4.7.7 Suppose that the power series $\sum_{n=0}^\infty a_nz^n$ has radius of convergence $R$.
Let $s_n = \sum_{j=0}^n a_j$. Investigate the radius of convergence of the power
series $\sum_{n=0}^\infty s_nz^n$.

4.7.8 Let 

DF 

n2” 

Show that if $|z| = 1$ and $z \ne 1$ then


y= ha =) Ge = for 2” <7 <2"tl andneN. 


J 
1 
k n - n+l 
p>, ee | nba] = fon 2 sf as 


Show that $\sum_{n=0}^\infty a_nz^n$ converges conditionally, for all $z$ with $|z| = 1$.


5 
The topology of R 


In this chapter, we consider some particular sorts of subsets of R, and their 
relation to convergence. This involves many definitions; familiarity will only 
come with use. We study the ideas that arise here in a more general setting 
in Volume II. 

5.1 Closed sets 


We begin by considering intervals in $\mathbf{R}$. A subset $I$ of $\mathbf{R}$ is an interval if
whenever two numbers belong to it, then so do all the intermediate points:
that is, if $a < c < b$ and $a, b \in I$ then $c \in I$. $\mathbf{R}$ is an interval. The empty set
and singleton sets are degenerate intervals. Other examples of intervals are
the semi-infinite intervals

$$(-\infty, b) = \{x \in \mathbf{R} : x < b\}, \qquad (-\infty, b] = \{x \in \mathbf{R} : x \le b\},$$
$$(a, \infty) = \{x \in \mathbf{R} : a < x\}, \qquad [a, \infty) = \{x \in \mathbf{R} : a \le x\},$$

and the bounded intervals

$$(a,b) = \{x \in \mathbf{R} : a < x < b\}, \qquad (a,b] = \{x \in \mathbf{R} : a < x \le b\},$$
$$[a,b) = \{x \in \mathbf{R} : a \le x < b\}, \qquad [a,b] = \{x \in \mathbf{R} : a \le x \le b\},$$

where $a < b$. It is an easy exercise to show that every interval is of one of
these forms. The length of a bounded interval is $b - a$; the length of $\mathbf{R}$ and
of semi-infinite intervals is $+\infty$.

Note that if $\mathcal{I}$ is a set of intervals then $\bigcap_{I \in \mathcal{I}} I$ is an interval, and that if $I_1$
and $I_2$ are intervals with $I_1 \cap I_2 \ne \emptyset$ then $I_1 \cup I_2$ is an interval.

Next, we consider the closure of a subset of $\mathbf{R}$. A real number $b$ is called a
closure point of a subset $A$ of $\mathbf{R}$ if whenever $\epsilon > 0$ there exists $a \in A$ (which
may depend upon $\epsilon$) with $|b - a| < \epsilon$. Thus $b$ is a closure point of $A$ if there
are points of $A$ arbitrarily close to $b$. If $b \in A$, then $b$ is a closure point of $A$,
since we can take $a = b$ for any $\epsilon > 0$.

We can use convergent sequences to characterize closure points.

Proposition 5.1.1 Suppose that $A$ is a subset of $\mathbf{R}$ and that $b \in \mathbf{R}$. $b$ is
a closure point of $A$ if and only if there exists a sequence $(a_j)_{j=1}^\infty$ in $A$ such
that $a_j \to b$ as $j \to \infty$.

Proof Suppose that there exists a sequence $(a_j)_{j=1}^\infty$ in $A$ such that $a_j \to b$
as $j \to \infty$. Suppose that $\epsilon > 0$. There exists $j_0$ such that $|b - a_j| < \epsilon$ for
$j \ge j_0$. Take $a = a_{j_0}$. Thus $b$ is a closure point of $A$.

Conversely, if $b$ is a closure point of $A$ then for each $j \in \mathbf{N}$ there exists
$a_j \in A$ with $|b - a_j| < 1/j$. Then $a_j \to b$ as $j \to \infty$.


The closure $\bar{A}$ of $A$ is the set of closure points of $A$. $A$ is a subset of $\bar{A}$ since
each point of $A$ is a closure point of $A$. A subset $A$ of $\mathbf{R}$ is said to be closed
if $A = \bar{A}$.

Proposition 5.1.2 A subset $A$ of $\mathbf{R}$ is closed if and only if whenever
$(a_n)_{n=1}^\infty$ is a sequence in $A$ which converges to $b$, then $b \in A$.

Proof This is an immediate consequence of Proposition 5.1.1.

In other words, a subset $A$ of $\mathbf{R}$ is closed if and only if it is closed under
taking limits. For example, the interval $[a,b]$ is closed, since if $a \le x_j \le b$ and
$x_j \to x$ as $j \to \infty$ then $a \le x \le b$, by Theorem 3.2.5. (This accords with our
use of the term closed interval in Section 3.2.) If $a < b$, and $x_j = a + (b-a)/2^j$
for $j \in \mathbf{Z}^+$ then $x_j \in (a,b]$, and $x_j \to a$ as $j \to \infty$. Thus $(a,b]$ is not closed,
since $a \notin (a,b]$. The set $\mathbf{Q}$ of rational numbers is not closed, since if $x$ is any
irrational number then by Corollary 3.2.7 there exists a sequence of rational
numbers which converges to $x$, so that $\bar{\mathbf{Q}} = \mathbf{R}$. A subset $A$ of a subset $B$ of
$\mathbf{R}$ is dense in $B$ if $B \subseteq \bar{A}$. Thus $\mathbf{Q}$ is dense in $\mathbf{R}$.
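
The sequence $x_j = a + (b-a)/2^j$ used above to show that $(a,b]$ is not closed can be written out exactly. In the Python sketch below the endpoints $a = 0$, $b = 1$ and the helper name `in_half_open` are illustrative choices; rational arithmetic keeps the membership tests exact.

```python
from fractions import Fraction

def in_half_open(x, a, b):
    """Membership test for the interval (a, b]."""
    return a < x <= b

a, b = Fraction(0), Fraction(1)
xs = [a + (b - a) / 2 ** j for j in range(60)]      # x_j = a + (b - a)/2^j

print(all(in_half_open(x, a, b) for x in xs))       # every term lies in (a, b]
print(in_half_open(a, a, b))                        # but the limit a does not
print(float(xs[-1] - a))                            # distance from a is already tiny
```

Every term of the sequence lies in $(a,b]$, while the distance to $a$ halves at each step, so $a$ is a closure point that does not belong to the set.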


Proposition 5.1.3 Suppose that $A$ and $B$ are subsets of $\mathbf{R}$.

(i) If $A \subseteq B$ then $\bar{A} \subseteq \bar{B}$.
(ii) $\bar{A}$ is closed.
(iii) $\bar{A}$ is the smallest closed set containing $A$: if $C$ is closed and $A \subseteq C$
then $\bar{A} \subseteq C$.

Proof (i) follows trivially from the definition of closure.

(ii) Suppose that $b$ is a closure point of $\bar{A}$ and suppose that $\epsilon > 0$. Then
there exists $c \in \bar{A}$ such that $|b - c| < \epsilon/2$, and there exists $a \in A$ with
$|c - a| < \epsilon/2$. Thus $|b - a| < \epsilon$, and so $b \in \bar{A}$.

(iii) By (i), $\bar{A} \subseteq \bar{C} = C$.

Here are some fundamental properties of the collection of closed subsets
of $\mathbf{R}$.

Proposition 5.1.4 (i) The empty set $\emptyset$ and $\mathbf{R}$ are closed.

(ii) If $\mathcal{A}$ is a set of closed subsets of $\mathbf{R}$ then $\bigcap_{A \in \mathcal{A}} A$ is closed.

(iii) If $\{A_1, \ldots, A_n\}$ is a finite set of closed subsets of $\mathbf{R}$ then $A = \bigcup_{j=1}^n A_j$
is closed.

Proof (i) The empty set is closed, since it has no closure points, and $\mathbf{R}$ is
trivially closed.

(ii) Suppose that $b$ is a closure point of $\bigcap_{A \in \mathcal{A}} A$, and that $A \in \mathcal{A}$. If $\epsilon > 0$
then there exists $a \in \bigcap_{A \in \mathcal{A}} A$ with $|b - a| < \epsilon$. But then $a \in A$. Since this
holds for all $\epsilon > 0$, $b \in \bar{A} = A$. Since this holds for all $A \in \mathcal{A}$, $b \in \bigcap_{A \in \mathcal{A}} A$.

(iii) Suppose that $b \notin A$. If $1 \le j \le n$ then $b \notin A_j = \bar{A_j}$, and so there exists
$\epsilon_j > 0$ such that if $|b - c| < \epsilon_j$ then $c \notin A_j$. Let $\epsilon = \min\{\epsilon_j : 1 \le j \le n\}$.
Then $\epsilon > 0$, and if $|b - c| < \epsilon$ then $c \notin \bigcup_{j=1}^n A_j = A$. Thus $b$ is not a closure
point of $A$; every closure point of $A$ is in $A$, and so $A$ is closed.


Corollary 5.1.5 A finite subset of $\mathbf{R}$ is closed.

Proof The empty set is closed, and a singleton set $\{a\}$ is closed, since if $b \ne a$
then, setting $\epsilon = |b - a|$, $\{a\} \cap \{x : |x - b| < \epsilon\} = \emptyset$. Now apply (iii).


Let us give another example. 


Example 5.1.6 Suppose that $(a_j)_{j=0}^\infty$ is a sequence of real numbers
convergent to $a$. Let $S = \{a_j : j \in \mathbf{Z}^+\}$. Then $\bar{S} = S \cup \{a\}$.

By Proposition 5.1.1, $a \in \bar{S}$. Suppose that $b \notin S \cup \{a\}$. We shall show that
$b$ is not a closure point of $S$. Let $\eta = |b - a|/2$: then $\eta > 0$. There exists $j_0$
such that $|a_j - a| < \eta$ for $j \ge j_0$. Then by the triangle inequality,

$$|b - a_j| \ge |b - a| - |a_j - a| > 2\eta - \eta = \eta, \quad\text{for } j \ge j_0.$$

Let $\epsilon = \min(\eta, \min\{|b - a_j| : 0 \le j < j_0\})$. Then $\epsilon > 0$, and if $s \in S$ then
$|b - s| \ge \epsilon$. Thus $b$ is not a closure point of $S$.


Proposition 5.1.7 If $A$ is a non-empty subset of $\mathbf{R}$ which is bounded
above then $\sup A \in \bar{A}$.

Proof For each $j \in \mathbf{N}$ there exists $a_j \in A$ with $\sup A - 1/j < a_j \le \sup A$.
Then $a_j \to \sup A$, so that $\sup A \in \bar{A}$.


We can also consider subsets of a subset $X$ of $\mathbf{R}$. Suppose that $A$ is a subset
of $X$. Then the relative closure of $A$ in $X$ is the set $\bar{A} \cap X$ of closure points of
$A$ which are in $X$. The set $A$ is relatively closed in $X$ if it is equal to its relative
closure. Relatively closed sets can be characterized in the following way.

Proposition 5.1.8 Suppose that $A$ is a subset of a subset $X$ of $\mathbf{R}$. Then
the following are equivalent:

(i) $A$ is relatively closed in $X$;
(ii) there exists a closed subset $F$ of $\mathbf{R}$ such that $A = F \cap X$;
(iii) if $(a_n)_{n=1}^\infty$ is a sequence in $A$ which converges to a point $b$ of $X$ then
$b \in A$.


Proof This is a worthwhile exercise for the reader. 


Exercises 


5.1.1 Verify that every interval in R is of one of the forms described at the 
beginning of this section. 

5.1.2 Show that a subset $I$ of $\mathbf{R}$ is an interval if and only if whenever $a, b \in I$
and $0 \le t \le 1$ then $(1-t)a + tb \in I$.

5.1.3 Suppose that $A \subseteq \mathbf{R}$. Show that the following are equivalent.
(a) $A$ is closed.
(b) If $[a,b]$ is a closed interval for which $A \cap [a,b]$ is non-empty then
$\sup(A \cap [a,b]) \in A$ and $\inf(A \cap [a,b]) \in A$.

5.1.4 Suppose that $A$ is a non-empty closed subset of $\mathbf{R}$ and that $b \in \mathbf{R}$.
Show that there exists $a \in A$ such that $|a - b| = \inf\{|x - b| : x \in A\}$.
Is $a$ unique?

5.1.5 If $A$ and $B$ are non-empty subsets of $\mathbf{R}$, we set $A + B = \{a + b :
a \in A, b \in B\}$.
(a) Give an example of closed sets $A$ and $B$ for which $A + B$ is not
closed.
(b) Show that if $A$ is closed, and $B$ is closed and bounded, then $A + B$
is closed. [Hint: Use the Bolzano-Weierstrass theorem.]

5.1.6 Suppose that $(A_j)_{j=0}^\infty$ is a sequence of subsets of $\mathbf{R}$. Show that

$$\overline{\bigcup_{j=0}^n A_j} = \bigcup_{j=0}^n \bar{A_j} \quad\text{and that}\quad \overline{\bigcup_{j=0}^\infty A_j} \supseteq \bigcup_{j=0}^\infty \bar{A_j}.$$

Give an example to show that the inclusion can be strict. What about
intersections?

5.1.7 Suppose that $x$ is an irrational number. Let $a_n = \{nx\}$, the fractional
part of $nx$. Use the pigeonhole principle to show that if $\epsilon > 0$ then there
exist distinct $m, n$ such that $|a_m - a_n| < \epsilon$. Show that $\{a_n : n \in \mathbf{N}\}$ is dense in
$[0, 1]$.

5.1.8 Using Proposition 5.1.2, give proofs of Propositions 5.1.3, 5.1.4 and
5.1.7 which use convergent sequences. Do the same for Example 5.1.6.


5.2 Open sets


Suppose that $a \in \mathbf{R}$ and $\epsilon > 0$. We define the open $\epsilon$-neighbourhood of $a$ to
be the set of all numbers distant less than $\epsilon$ from $a$:

$$N_\epsilon(a) = \{x \in \mathbf{R} : |x - a| < \epsilon\}.$$

$N_\epsilon(a)$ is the interval $(a - \epsilon, a + \epsilon)$. We can express convergence in terms of
$\epsilon$-neighbourhoods; $a_j \to a$ as $j \to \infty$ if and only if for each $\epsilon > 0$ there exists
$j_0$ such that $a_j \in N_\epsilon(a)$ for $j \ge j_0$. Similarly, the closure of a set is defined in
terms of $\epsilon$-neighbourhoods: $a \in \bar{A}$ if and only if $N_\epsilon(a) \cap A \ne \emptyset$, for all $\epsilon > 0$.

Suppose that $A$ is a subset of $\mathbf{R}$. An element $a$ of $A$ is an interior point of
$A$ if there exists $\epsilon > 0$ such that $N_\epsilon(a) \subseteq A$. In other words, all the numbers
sufficiently close to $a$ are in $A$; we can move a little way from $a$ without
leaving $A$. The interior $A^\circ$ of $A$ is the set of interior points of $A$. A subset
$U$ of $\mathbf{R}$ is open if $U = U^\circ$. The interval $(a,c) = \{b \in \mathbf{R} : a < b < c\}$ is
open: if $b \in (a,c)$, we can take $\epsilon = \min(c - b, b - a)$. In particular, an open
$\epsilon$-neighbourhood $N_\epsilon(x)$ is open.
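
The choice $\epsilon = \min(c - b, b - a)$ can be made concrete. The Python sketch below computes this radius for a point of an open interval and checks that the resulting neighbourhood does not extend beyond the endpoints; the function name `interior_radius` and the sample values are illustrative choices.

```python
from fractions import Fraction

def interior_radius(b, a, c):
    """Return eps = min(c - b, b - a) for a point b of the open interval (a, c)."""
    if not a < b < c:
        raise ValueError("b must lie strictly between a and c")
    return min(c - b, b - a)

a, c = Fraction(0), Fraction(1)
b = Fraction(1, 4)
eps = interior_radius(b, a, c)

# N_eps(b) = (b - eps, b + eps) is contained in (a, c) when its endpoints
# satisfy a <= b - eps and b + eps <= c.
print(eps, a <= b - eps, b + eps <= c)
```

For an endpoint such as $b$ in $(a,b]$ no positive radius works, which is why the function insists that the point lies strictly inside the interval.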

The collection of open sets of R is called the topology of R. Properties that 
can be defined in terms of the open sets are called topological properties. The 
word ‘topology’ is also used to describe the study of topological properties. 

The interval $(a,b]$ is not open: there is no suitable $\epsilon$ for $b$. Thus $(a,b]$ is
an example of a set which is neither open nor closed. The set $\mathbf{Q}$ of rational
numbers is also neither open nor closed: we have seen that it is not closed,
and it is not open, since if $r \in \mathbf{Q}$ and $\epsilon > 0$ then the open $\epsilon$-neighbourhood
$N_\epsilon(r)$ contains irrational points (see Exercise 3.1.2).

'Interior' and 'closure', 'open' and 'closed', are closely related, as the next
proposition shows. Recall that we denote the complement $\mathbf{R} \setminus S$ of a subset
$S$ of $\mathbf{R}$ by $C(S)$.


Proposition 5.2.1 Suppose that $A$ and $B$ are subsets of $\mathbf{R}$, and
that $a \in \mathbf{R}$.


(i) If $A \subseteq B$ then $A^\circ \subseteq B^\circ$.
(ii) $C(A^\circ) = \overline{C(A)}$.
(iii) $A$ is open if and only if $C(A)$ is closed.
(iv) $A^\circ$ is open.
(v) $A^\circ$ is the largest open set contained in $A$: if $U$ is open and
$U \subseteq A$ then $U \subseteq A^\circ$.


Proof (i) This follows directly from the definition.

(ii) If $b \notin A^\circ$ then $N_\epsilon(b) \cap C(A) \neq \emptyset$ for
all $\epsilon > 0$, and so $b \in \overline{C(A)}$. Conversely, if
$b \in \overline{C(A)}$ then $N_\epsilon(b) \cap C(A) \neq \emptyset$ for all
$\epsilon > 0$, and so $b \notin A^\circ$.

(iii) If $A$ is open then $C(A) = C(A^\circ) = \overline{C(A)}$, by (ii), and so
$C(A)$ is closed. If $C(A)$ is closed then
$C(A^\circ) = \overline{C(A)} = C(A)$, so that $A^\circ = A$.

(iv) $C(A^\circ) = \overline{C(A)}$ is closed, so that $A^\circ$ is open, by
(iii).

(v) By (i), $U = U^\circ \subseteq A^\circ$.


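Corollary 5.2.2 (i) The empty set $\emptyset$ and $\mathbf{R}$ are open.
(ii) If $\mathcal{A}$ is a set of open subsets of $\mathbf{R}$ then
$\bigcup_{A \in \mathcal{A}} A$ is open.
(iii) If $\{A_1, \ldots, A_n\}$ is a finite set of open subsets of $\mathbf{R}$
then $\bigcap_{j=1}^n A_j$ is open.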


Proof Take complements. 


If $A$ is a subset of $\mathbf{R}$ then the frontier or boundary $\partial A$
of $A$ is the set $\overline{A} \setminus A^\circ$. Since
$\partial A = \overline{A} \cap \overline{C(A)}$, $\partial A$ is closed.
$x \in \partial A$ if and only if every open $\epsilon$-neighbourhood of $x$
contains an element of $A$ and an element of $C(A)$.

We can also consider subsets of a subset $X$ of $\mathbf{R}$. Suppose that $A$
is a subset of $X$. Then a point $a$ of $A$ is an interior point of $A$
relative to $X$ if there exists $\epsilon > 0$ such that
$N_\epsilon(a) \cap X \subseteq A$. The set of interior points of $A$ relative
to $X$ is then the relative interior of $A$ in $X$, and $A$ is relatively open
in $X$ if it is equal to its relative interior in $X$.


Relatively open sets can be characterized in the following way. 


Proposition 5.2.3 Suppose that $A$ is a subset of a subset $X$ of
$\mathbf{R}$. Then the following are equivalent:


(i) $A$ is relatively open in $X$;
(ii) there exists an open subset $U$ of $\mathbf{R}$ such that $A = U \cap X$;
(iii) the set $X \setminus A$ is relatively closed in $X$.


Proof Again, this is a worthwhile exercise for the reader. 


Exercises 


5.2.1 Suppose that A and B are subsets of R and that A is open. Show that 
A+ B is open. 

5.2.2 Suppose that A is a subset of R. Show that by repeatedly taking clo- 
sures and interiors, we can obtain at most six more different sets. Give 
an example to show that six more different sets can be obtained. 


5.3 Connectedness 


We now establish a fundamental characterization of non-empty intervals, in
terms of open sets. This will allow us to say more about open sets. We need
some more terminology. A non-empty subset $A$ of $\mathbf{R}$ splits if there
exist two disjoint open subsets $U_1$ and $U_2$ of $\mathbf{R}$ such that
$A \subseteq U_1 \cup U_2$ and $A \cap U_1$ and $A \cap U_2$ are non-empty. If
$A$ does not split, it is connected.


Theorem 5.3.1 A non-empty subset $A$ of $\mathbf{R}$ is connected if and only
if it is an interval.


Proof Suppose first that $A$ is not an interval. Then there exist $a < b < c$
such that $a, c \in A$ and $b \notin A$. Let $U_1 = (-\infty, b)$ and let
$U_2 = (b, +\infty)$. Then $U_1$ and $U_2$ are disjoint open sets,
$A \subseteq U_1 \cup U_2$, and $A \cap U_1$ and $A \cap U_2$ are non-empty.
Thus $A$ splits.

Conversely, suppose that $A$ splits. Thus there exist disjoint open subsets
$U_1$ and $U_2$ such that $A \subseteq U_1 \cup U_2$, and $A \cap U_1$ and
$A \cap U_2$ are non-empty. Let $a_1 \in A \cap U_1$ and $a_2 \in A \cap U_2$.
Without loss of generality, we can suppose that $a_1 < a_2$. Let
$b = \sup(U_1 \cap [a_1, a_2])$. We shall show that $a_1 < b < a_2$ and that
$b \notin A$, so that $A$ is not an interval. First, there exists
$0 < \delta < a_2 - a_1$ such that $(a_1 - \delta, a_1 + \delta) \subseteq U_1$.
Thus $b \ge a_1 + \delta > a_1$. Secondly, there exists
$0 < \epsilon < a_2 - a_1$ such that
$(a_2 - \epsilon, a_2 + \epsilon) \subseteq U_2$; thus $b < a_2$. Suppose if
possible that $b \in U_1$. Then there exists $0 < \eta < a_2 - b$ such that
$(b - \eta, b + \eta) \subseteq U_1$. Then
$(b, b + \eta) \subseteq U_1 \cap [a_1, a_2]$, contradicting the definition of
$b$. Thus $b \notin U_1$. Suppose if possible that $b \in U_2$. Then there
exists $0 < \zeta < \delta$ such that $(b - \zeta, b + \zeta) \subseteq U_2$.
Then $b - \zeta/2$ is an upper bound for $U_1 \cap [a_1, a_2]$, again
contradicting the definition of $b$. Thus $b \notin U_1 \cup U_2$, and so
$b \notin A$.


Corollary 5.3.2 A subset $A$ of $\mathbf{R}$ is both open and closed if and
only if $A = \emptyset$ or $A = \mathbf{R}$.


Proof We have seen that $\emptyset$ and $\mathbf{R}$ are both open and closed.
If $A$ is open and closed then $C(A)$ is open and closed.
$\mathbf{R} = A \cup C(A)$; since $\mathbf{R}$ is connected, either
$A = \emptyset$ or $C(A) = \emptyset$.


Open subsets of R can now be characterized as disjoint unions of open 
intervals. 


Theorem 5.3.3 Suppose that $U$ is a non-empty subset of $\mathbf{R}$. $U$ is
open if and only if there is a countable set $\mathcal{I}$ of disjoint open
intervals such that $U = \bigcup_{I \in \mathcal{I}} I$. The set $\mathcal{I}$
is uniquely determined.


Proof A union of open intervals is open, by Corollary 5.2.2, and so the
condition is sufficient. Suppose that $U$ is open. We define an equivalence
relation on $U$ by setting $a \sim b$ if $[a,b] \subseteq U$ (here we allow the
possibility that $a > b$, in which case
$[a,b] = \{c \in \mathbf{R} : b \le c \le a\}$). If $a \sim b$ then
$b \sim a$, since $[a,b] = [b,a]$; if $a \sim b$ and $b \sim c$ then
$a \sim c$, since $[a,c] \subseteq [a,b] \cup [b,c] \subseteq U$. Thus $\sim$
is an equivalence relation on $U$. Let $\mathcal{I}$ be the set of equivalence
classes. If $I \in \mathcal{I}$, then $I$ is an interval, and $I$ and
$U \setminus I$ are disjoint. If $a \in I$, then there exists $\epsilon > 0$
such that $N_\epsilon(a) \subseteq U$. If $b \in N_\epsilon(a)$ then
$[a,b] \subseteq N_\epsilon(a)$ and so $b \in I$. Thus
$N_\epsilon(a) \subseteq I$, and $I$ is open. Thus $U$ is a disjoint union of a
set $\mathcal{I}$ of open intervals.

If $r \in U \cap \mathbf{Q}$, let $I_r$ be the equivalence class to which $r$
belongs. Since a non-empty open interval contains rational points (between any
two real numbers, there is a rational number), the mapping
$r \to I_r : U \cap \mathbf{Q} \to \mathcal{I}$ is surjective. Since
$U \cap \mathbf{Q}$ is countable, so is $\mathcal{I}$.

The representation is unique. Suppose that
$U = \bigcup_{J \in \mathcal{J}} J$, where $\mathcal{J}$ is a set of disjoint
open intervals. Suppose that $J \in \mathcal{J}$, and that $x \in J$. Then
$x \in I$, for some $I \in \mathcal{I}$. If $y \in J$ then
$[x,y] \subseteq J \subseteq U$; hence $y \in I$ and $J \subseteq I$. Further,
$I = J \cup ((U \setminus J) \cap I)$. Since $J$ and $(U \setminus J) \cap I$
are disjoint open subsets of $I$, and $I$ is connected, $I = J$. Hence
$\mathcal{J} = \mathcal{I}$.
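
For an open set that is presented as a finite union of open intervals, the decomposition into disjoint open intervals can be computed directly. The Python sketch below is only an illustration under that finiteness assumption (the theorem itself needs no such restriction); the function name `components` is ours.

```python
def components(intervals):
    """Merge a finite list of open intervals (lo, hi) into disjoint open intervals.

    Two open intervals are merged only when they overlap: (0, 1) and (1, 2)
    stay separate, because the point 1 belongs to neither of them.
    """
    result = []
    for lo, hi in sorted(iv for iv in intervals if iv[0] < iv[1]):
        if result and lo < result[-1][1]:
            # Overlaps the last component: extend that component to the right.
            result[-1] = (result[-1][0], max(result[-1][1], hi))
        else:
            result.append((lo, hi))
    return result


if __name__ == "__main__":
    print(components([(0, 1), (1, 2), (0.5, 0.8), (3, 5), (4, 6)]))
```

Each interval returned is an equivalence class of the relation $a \sim b$ used in the proof: two points are equivalent exactly when the closed interval between them lies in the union.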


Exercises 


5.3.1 Suppose that $A$ is a non-empty subset of $\mathbf{R}$. Show that $A$ is
connected if and only if whenever $F_1$ and $F_2$ are closed subsets of
$\mathbf{R}$ whose union is $\mathbf{R}$ then either $A \subseteq F_1$ or
$A \subseteq F_2$.

5.3.2 Suppose that $G$ is a proper closed subgroup of $(\mathbf{R}, +)$ and
that $G \neq \{0\}$. Suppose that $(a,b)$ is a connected component of
$\mathbf{R} \setminus G$. Show that $G = \{n(b - a) : n \in \mathbf{Z}\}$.

5.3.3 Suppose that $F$ and $G$ are closed subsets of $\mathbf{R}$, that
$[c_0, d_0] \subseteq F \cup G$ and that $c_0 \in F$, $d_0 \in G$. If
$(c_0 + d_0)/2 \in F$, set $c_1 = (c_0 + d_0)/2$, $d_1 = d_0$; otherwise set
$c_1 = c_0$, $d_1 = (c_0 + d_0)/2$. Repeat recursively. Show that there exists
$b \in [c_0, d_0]$ such that $c_n \to b$ and $d_n \to b$ as $n \to \infty$.
Show that $b \in F \cap G$. Use this to give another proof of Theorem 5.3.1.

5.3.4 Suppose that $\{O_\alpha\}_{\alpha \in A}$ is a family of disjoint
non-empty open subsets of $\mathbf{R}$ (if $\alpha \neq \beta$ then
$O_\alpha \cap O_\beta = \emptyset$).

(a) Show that $A$ is countable.

(b) Suppose that, for each $\alpha$,
$O_\alpha = \bigcup_{I \in \mathcal{I}_\alpha} I$, where $\mathcal{I}_\alpha$
is a set of disjoint non-empty open intervals. Show that
$\mathcal{I} = \bigcup_{\alpha \in A} \mathcal{I}_\alpha$ is a set of disjoint
non-empty open intervals whose union is $\bigcup_{\alpha \in A} O_\alpha$.


5.4 Compact sets 


We now use the Bolzano-Weierstrass theorem to obtain some important results
about the bounded closed subsets of $\mathbf{R}$. Suppose that $A$ is a subset
of a set $X$ and that $\mathcal{B}$ is a set of subsets of $X$. We say that
$\mathcal{B}$ covers $A$, or that $\mathcal{B}$ is a cover of $A$, if
$A \subseteq \bigcup_{B \in \mathcal{B}} B$. A subset $\mathcal{C}$ of
$\mathcal{B}$ is a subcover if it covers $A$. A cover $\mathcal{B}$ is finite
if the set $\mathcal{B}$ is finite. If $X = \mathbf{R}$, a cover $\mathcal{B}$
is open if each $B \in \mathcal{B}$ is an open set.


Theorem 5.4.1 Suppose that $\mathcal{U}$ is an open cover of the bounded
closed interval $[a,b]$. Then there exists $\delta > 0$ such that if
$x \in [a,b]$ then there exists $U \in \mathcal{U}$ such that
$N_\delta(x) \subseteq U$.


Proof Suppose not. Then for each $n \in \mathbf{N}$ there exists
$x_n \in [a,b]$ such that $N_{1/n}(x_n)$ is not contained in any
$U \in \mathcal{U}$. By the Bolzano-Weierstrass theorem, there exists a
convergent subsequence $(x_{n_k})_{k=1}^\infty$, convergent to $x$, say. Since
$[a,b]$ is closed, $x \in [a,b]$. Thus $x \in U$, for some
$U \in \mathcal{U}$. Since $U$ is open, there exists $\epsilon > 0$ such that
$N_\epsilon(x) \subseteq U$. Since $x_{n_k} \to x$ as $k \to \infty$, there
exists $K \in \mathbf{N}$, with $n_K > 2/\epsilon$, such that
$|x_{n_k} - x| < \epsilon/2$ for $k \ge K$. If $y \in N_{1/n_K}(x_{n_K})$
then, by the triangle inequality,


$$|y - x| \le |y - x_{n_K}| + |x_{n_K} - x| < 1/n_K + \epsilon/2 < \epsilon,$$


so that $y \in U$. Thus $N_{1/n_K}(x_{n_K}) \subseteq U$, giving a
contradiction.


A positive number 6 which satisfies the conclusions of this theorem is called 
a Lebesgue number for the cover. 


Theorem 5.4.2 (The Heine—Borel theorem for open sets) Suppose that U 
is an open cover of the (non-degenerate) closed interval [a,b]. Then there is 
a finite subcover. 


Proof We give two proofs of this fundamental theorem. Another proof is given
in Exercise 5.4.4. First, let $\delta$ be a Lebesgue number for the cover. We
divide $[a,b]$ into finitely many intervals, each of length less than
$\delta$: choose $n \in \mathbf{N}$ such that $n > (b - a)/\delta$, and let
$a_j = a + j(b - a)/n$, for $0 \le j \le n$. Then
$a = a_0 < a_1 < \cdots < a_n = b$, and $a_j - a_{j-1} = (b - a)/n < \delta$.
For each $0 \le j \le n$ there exists $U_j \in \mathcal{U}$ such that
$N_\delta(a_j) \subseteq U_j$. Then $[a,b] \subseteq \bigcup_{j=0}^n U_j$.

For the second proof, let


$$C = \{x \in [a,b] : \text{there is a finite subcover of } [a,x]\}.$$


We must show that $b \in C$. Since $a \in C$, $C$ is not empty. Let
$s = \sup C$. We take three steps.

First, $s > a$. For $a \in U$ for some $U \in \mathcal{U}$, and there exists
$\epsilon > 0$ such that $N_\epsilon(a) \subseteq U$. Thus
$N_\epsilon(a) \cap [a,b] \subseteq C$, so that if
$c = \min(a + \epsilon/2, b)$ then $c \in C$. Hence $s \ge c > a$.

Secondly, $s \in C$. For $s \in V$ for some $V \in \mathcal{U}$, and there
exists $\eta > 0$ such that $N_\eta(s) \subseteq V$. Then there exists
$c \in C \cap (s - \eta, s]$. $[a,c]$ has a finite subcover $\mathcal{W}$, and
$\mathcal{W} \cup \{V\}$ is a finite subcover of $[a,s]$.

Finally, $s = b$. For if not, and if $s < t < \min(s + \eta, b)$ then
$\mathcal{W} \cup \{V\}$ is a finite subcover of $[a,t]$, so that $t \in C$,
contradicting the definition of $s$.
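
The second proof moves a point $s$ steadily to the right until it passes $b$. When the cover is already a finite list of open intervals, the same idea gives a simple greedy procedure for picking out a subcover. The Python sketch below works under that assumption only (it says nothing about infinite covers, which are the real content of the theorem), and the name `finite_subcover` is ours.

```python
def finite_subcover(a, b, cover):
    """Choose intervals from `cover` (open intervals (lo, hi)) that cover [a, b].

    Returns the chosen intervals in order, or None if some point of [a, b]
    is not covered.
    """
    chosen = []
    s = a  # invariant: every point of [a, s) lies in one of the chosen intervals
    while True:
        # Open intervals that contain the point s.
        candidates = [(lo, hi) for lo, hi in cover if lo < s < hi]
        if not candidates:
            return None  # s is not covered
        best = max(candidates, key=lambda iv: iv[1])  # reaches furthest right
        chosen.append(best)
        if best[1] > b:
            return chosen  # [a, b] is now covered
        s = best[1]  # the right end-point of `best` still has to be covered


if __name__ == "__main__":
    cover = [(-1, 0.3), (0.2, 0.5), (0.1, 0.6), (0.55, 1.2), (0.4, 0.58)]
    print(finite_subcover(0, 1, cover))
```

Since $s$ increases strictly at each step and the list is finite, the loop terminates; each interval is chosen at most once.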


A set B is said to be compact if every open cover of B has a finite subcover. 


Proposition 5.4.3 Suppose that $(U_n)_{n=1}^\infty$ is an increasing sequence
of open sets in $\mathbf{R}$ which covers a compact subset $K$ of
$\mathbf{R}$. Then there exists $N \in \mathbf{N}$ such that
$K \subseteq U_N$.


Proof There is a finite subcover $\{U_{n_1}, \ldots, U_{n_k}\}$. Then
$K \subseteq U_N$, where $N = \max\{n_1, \ldots, n_k\}$.


Theorem 5.4.4 A non-empty subset B of R is compact if and only if it 
is closed and bounded. 


Proof Suppose first that $B$ is closed and bounded, and that $\mathcal{U}$ is
an open cover of $B$. There exists $[a,b]$ such that $B \subseteq [a,b]$. Then
$\mathcal{U} \cup \{C(B)\}$ is an open cover of $[a,b]$, and so there is a
finite subcover $\{U_1, \ldots, U_n, C(B)\}$ of $[a,b]$. Then
$\{U_1, \ldots, U_n\}$ covers $B$.

Conversely, suppose that $B$ is compact. Let $U_n = (-n, n)$. Then
$(U_n)_{n=1}^\infty$ is an increasing sequence of open sets which covers $B$,
and so, by Proposition 5.4.3, there exists $N \in \mathbf{N}$ such that
$B \subseteq U_N$; $B$ is bounded.

Finally, we show that $B$ is closed. Suppose that $a \notin B$. We shall show
that $a \notin \overline{B}$. For each $n \in \mathbf{N}$ let


$$V_n = \{x \in \mathbf{R} : |x - a| > 1/n\} = (-\infty, a - 1/n) \cup (a + 1/n, \infty).$$


Then $(V_n)_{n=1}^\infty$ is an increasing sequence of open sets which covers
$B$, and so, by Proposition 5.4.3, there exists $N \in \mathbf{N}$ such that
$B \subseteq V_N$. Then $N_{1/N}(a) \cap B = \emptyset$, so that
$a \notin \overline{B}$. Thus $B$ is closed.


We can formulate the Heine-Borel theorem in terms of closed sets: this version
is quite as useful as the ‘open sets’ version. We need more terminology. A set
$\mathcal{F}$ of subsets of a set $X$ has the finite intersection property if
whenever $\{F_1, \ldots, F_n\}$ is a finite subset of $\mathcal{F}$ then
$\bigcap_{j=1}^n F_j$ is non-empty.


Theorem 5.4.5 (The Heine-Borel theorem for closed sets) Suppose that $B$ is a
bounded closed subset of $\mathbf{R}$, and that $\mathcal{F}$ is a set of
closed subsets of $B$ with the finite intersection property. Then the total
intersection $\bigcap_{F \in \mathcal{F}} F$ is non-empty.


Proof This is just a matter of taking complements. Suppose that
$\bigcap_{F \in \mathcal{F}} F = \emptyset$. Then
$\{C(F) : F \in \mathcal{F}\}$ is an open cover of $B$, and so there is a
finite subcover $\{C(F_1), \ldots, C(F_n)\}$. Thus


$$B \subseteq C(F_1) \cup \cdots \cup C(F_n) = C(F_1 \cap \cdots \cap F_n),$$


so that $F_1 \cap \cdots \cap F_n = \emptyset$, contradicting the finite
intersection property.


Exercises 


5.4.1 Let $(r_n)_{n=1}^\infty$ be an enumeration of the rational numbers in
$(0,1)$, and let $0 < \epsilon < 1$. For each $n$ let $I_n$ be an open
interval in $(0,1)$ containing $r_n$ and of length at most $\epsilon/2^n$. Let
$U = \bigcup_{n=1}^\infty I_n$. Show that $\overline{U} = [0,1]$.

Suppose that $U = \bigcup_{n=1}^\infty J_n$, where $(J_n)_{n=1}^\infty$ is a
sequence of disjoint open intervals; let $l(J_n)$ be the length of $J_n$. Show
that


$$\sum_{n=1}^\infty l(J_n) \le \epsilon.$$


5.4.2 The set $\mathbf{Q} \cap [0,1]$ is not compact. Find an open cover of
$\mathbf{Q} \cap [0,1]$ with no finite subcover.

5.4.3 Suppose that $\mathcal{F}$ is a finite set of open intervals which
covers the closed interval $[a,b]$, and that $\mathcal{F}$ is minimal; no
proper subset of $\mathcal{F}$ covers $[a,b]$. Show that $\mathcal{F}$ can be
listed as $I_1, \ldots, I_n$ in such a way that


$$a \in I_1, \quad \inf I_j < \sup I_{j-1} \le \inf I_{j+1} < \sup I_j \ \text{ for } 1 < j < n, \quad \text{and } b \in I_n.$$


Deduce that $I_{j-1} \cap I_j \neq \emptyset$ for $2 \le j \le n$, and that no
point of $[a,b]$ is in three members of $\mathcal{F}$.

5.4.4 Suppose that $\mathcal{U}$ is an open cover of the closed interval
$[c_0, d_0]$, and suppose, if possible, that there is no finite subcover. If
there is no finite subcover of $[c_0, (c_0 + d_0)/2]$ set $c_1 = c_0$,
$d_1 = (c_0 + d_0)/2$; otherwise set $c_1 = (c_0 + d_0)/2$, $d_1 = d_0$. Show
that $[c_1, d_1]$ has no finite subcover. Repeat recursively. Show that there
exists $b \in [c_0, d_0]$ such that $c_n \to b$ and $d_n \to b$ as
$n \to \infty$. Use this to give another proof of the Heine-Borel theorem.

5.4.5 Suppose that $U$ is an open subset of $\mathbf{R}$ and that $x \in U$.
Show that there exist rational numbers $r$ and $s$ such that
$x \in N_r(s) \subseteq U$. Show that if $\mathcal{U}$ is an open cover of a
subset $A$ of $\mathbf{R}$ then there exists a countable subcover.
We now introduce another idea, similar enough to the notion of a closure point
to be confusing. Suppose that $A$ is a subset of $\mathbf{R}$. A real number
$b$ is called a limit point, or accumulation point, of $A$ if whenever
$\epsilon > 0$ there exists $a \in A$ (which may depend upon $\epsilon$) with
$0 < |b - a| < \epsilon$. Thus $b$ is a limit point of $A$ if there are points
of $A$, different from $b$, which are arbitrarily close to $b$.

If $a \in \mathbf{R}$ and $\epsilon > 0$ then the punctured
$\epsilon$-neighbourhood $N^*_\epsilon(a)$ of $a$ is defined as


$$N^*_\epsilon(a) = \{x \in \mathbf{R} : 0 < |x - a| < \epsilon\} = (a - \epsilon, a) \cup (a, a + \epsilon) = N_\epsilon(a) \setminus \{a\}.$$


Thus $b$ is a limit point of $A$ if and only if
$N^*_\epsilon(b) \cap A \neq \emptyset$, for each $\epsilon > 0$.


Proposition 5.5.1 Suppose that $A$ is a subset of $\mathbf{R}$ and that
$b \in \mathbf{R}$. $b$ is a limit point of $A$ if and only if there exists a
sequence $(a_j)_{j=1}^\infty$ in $A \setminus \{b\}$ such that $a_j \to b$ as
$j \to \infty$.


Proof The proof is just like the proof of Proposition 5.1.1, with obvious
modifications.


The set of limit points of $A$ is called the derived set of $A$, and is denoted
by $A'$. It follows from the definitions that $A' \subseteq \overline{A}$. If
$A = \{a\}$ then $A' = \emptyset$, so that $A$ need not be a subset of $A'$.
If $A = A'$, we say that $A$ is perfect. For example, a non-degenerate closed
interval is perfect, as is a finite union of non-degenerate closed intervals.

Suppose that $A$ is a subset of $\mathbf{R}$ and that $a \in A$. $a$ is an
isolated point of $A$ if there exists $\epsilon > 0$ such that
$N^*_\epsilon(a) \cap A = \emptyset$.


Proposition 5.5.2 Suppose that $A'$ is the derived set of a subset $A$ of
$\mathbf{R}$, and that $a \in \mathbf{R}$. Let $i(A)$ be the set of isolated
points of $A$.

(i) $\overline{A}$ is the disjoint union of $A'$ and $i(A)$.

(ii) $A'$ is closed.


Proof (i) Clearly $A'$ and $i(A)$ are disjoint subsets of $\overline{A}$.
Suppose that $a \in \overline{A} \setminus i(A)$. There are two possibilities.
First, $a \in A$. Since $a$ is not an isolated point of $A$, it must belong to
$A'$. Secondly, $a \in \overline{A} \setminus A$. There is a sequence
$(a_n)_{n=1}^\infty$ in $A$ which tends to $a$ as $n \to \infty$. Since
$a_n \neq a$, for $n \in \mathbf{N}$, it follows that $a \in A'$.

(ii) Suppose, if possible, that $b \in \overline{A'} \setminus A'$. Then
$b \in \overline{A} \setminus A'$, and so $b$ is an isolated point of $A$, by
(i). Thus there exists $\epsilon > 0$ such that
$N^*_\epsilon(b) \cap A = \emptyset$. Since $b \in \overline{A'}$, there exists
$c \in A'$ with $|b - c| < \epsilon/2$. Since $b \notin A'$, $c \neq b$; let
$\eta = |b - c|$, so that $0 < \eta < \epsilon/2$. Then
$N^*_\eta(c) \subseteq N^*_\epsilon(b)$, so that
$N^*_\eta(c) \cap A = \emptyset$, giving a contradiction.


As an example, let $A = \{1/j : j \in \mathbf{N}\}$. Then
$\overline{A} = A \cup \{0\}$ and $A' = \{0\}$. Note that
$(A')' = \emptyset \neq A'$. This example is taken further in Exercises
5.5.2-5.5.4.

We now give an example of a bounded non-empty perfect set which contains no
non-degenerate intervals. This set is known as Cantor’s ternary set, although
it was first described by the Irish-born mathematician Henry Smith. We begin
with $C_0 = [0,1]$. First, we remove the middle third of $C_0$, to obtain


$$C_1 = [0, 1/3] \cup [2/3, 1] = I_L \cup I_R.$$


$I_L$ is the left interval of $C_1$ and $I_R$ is the right interval. We then
remove the middle thirds of $I_L$ and $I_R$ to obtain


$$C_2 = ([0, 1/9] \cup [2/9, 1/3]) \cup ([2/3, 7/9] \cup [8/9, 1]) = (I_{LL} \cup I_{LR}) \cup (I_{RL} \cup I_{RR});$$


$C_2$ is the union of $2^2$ disjoint closed intervals, each of length
$(1/3)^2$; $I_{LL}$ and $I_{RL}$ are left intervals, and $I_{LR}$ and $I_{RR}$
are right intervals. We then repeat the process recursively, to obtain a
decreasing sequence $(C_n)_{n=0}^\infty$ of closed sets; $C_n$ is the union of
$2^n$ disjoint closed intervals, each of length $(1/3)^n$, and each interval is
either a left subinterval or a right subinterval of an interval of $C_{n-1}$.
We then define Cantor’s ternary set $C$ to be $\bigcap_{n=0}^\infty C_n$.
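
The recursive construction is easy to carry out exactly. The following Python sketch, with a function name of our own choosing, lists the intervals of $C_n$ using rational arithmetic; it is an illustration of the construction only.

```python
from fractions import Fraction


def cantor_stage(n):
    """Return the closed intervals of C_n as a sorted list of (left, right) pairs."""
    intervals = [(Fraction(0), Fraction(1))]  # C_0 = [0, 1]
    for _ in range(n):
        next_intervals = []
        for lo, hi in intervals:
            third = (hi - lo) / 3
            next_intervals.append((lo, lo + third))  # left subinterval
            next_intervals.append((hi - third, hi))  # right subinterval
        intervals = next_intervals
    return intervals


if __name__ == "__main__":
    for lo, hi in cantor_stage(2):
        print(f"[{lo}, {hi}]")
```

For each $n$ the list has $2^n$ entries, each of length $(1/3)^n$, in agreement with the description above.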


Figure 5.5. Construction of Cantor’s ternary set.


Here are some of the properties of C. 
Theorem 5.5.3 Cantor’s ternary set $C$ is a perfect subset of $[0,1]$ with
empty interior. There exists a bijection $c : P(\mathbf{N}) \to C$, and so $C$
is uncountable.


Proof $C$ is closed, and is non-empty, by the Heine-Borel theorem for closed
sets. But in fact, the end points of all the intervals that occur in the
construction are in $C$. We use this to show that $C$ is perfect. Suppose that
$x \in C$ and that $\epsilon > 0$. Choose $j \in \mathbf{Z}^+$ such that
$(1/3)^j < \epsilon/2$. There exists an interval $I_j$ of $C_j$ to which $x$
belongs; both its end-points are in $C$, and at least one of them is different
from $x$. Thus there exists $c \in N^*_\epsilon(x) \cap C$, and so $C$ is
perfect.

Further, the $\epsilon$-neighbourhood $N_\epsilon(x)$ is not contained in an
interval of $C_j$, and so $N_\epsilon(x)$ is not contained in $C_j$; thus
$N_\epsilon(x)$ is not contained in $C$. Since this holds for all $x \in C$
and all $\epsilon > 0$, $C$ has an empty interior.

If $A \in P(\mathbf{N})$, let $a_j = 2$ if $j \in A$ and let $a_j = 0$
otherwise. Let $c_n(A) = \sum_{j=1}^n a_j/3^j$, and let
$c(A) = \sum_{j=1}^\infty a_j/3^j$. Then $c_n(A) \in C_n$, and
$c_n(A) \to c(A)$ as $n \to \infty$, and so $c(A) \in C$, since $C$ is closed.
As in the proof of
Cantor’s theorem, if A C B then c(A) < c(B), and c is injective. Conversely, 
suppose that x € C. Let 


$$A = \{n \in \mathbf{N} : x \text{ is in a right-hand interval of } C_n\}.$$


Then x = c(A). 
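
The partial sums $c_n(A)$ can be computed exactly, and membership of $C_n$ can be tested by following left and right subintervals. The Python sketch below does this for finite sets $A$; the function names are ours, and the code only illustrates the mapping $c$, it does not replace the proof.

```python
from fractions import Fraction


def c_partial(A, n):
    """c_n(A): the sum of a_j / 3**j for 1 <= j <= n, where a_j = 2 if j is in A."""
    return sum((Fraction(2, 3 ** j) for j in range(1, n + 1) if j in A), Fraction(0))


def in_cantor_stage(x, n):
    """Decide whether x belongs to C_n by descending through the construction."""
    lo, hi = Fraction(0), Fraction(1)
    if not lo <= x <= hi:
        return False
    for _ in range(n):
        third = (hi - lo) / 3
        if x <= lo + third:
            hi = lo + third      # x is in the left subinterval
        elif x >= hi - third:
            lo = hi - third      # x is in the right subinterval
        else:
            return False         # x is in the removed middle third
    return True


if __name__ == "__main__":
    A = {1, 3}
    x = c_partial(A, 3)
    print(x, in_cantor_stage(x, 3), in_cantor_stage(x, 10))
```

The value $c_n(A)$ is a left end-point of an interval of $C_n$, which is why it remains in every later $C_m$.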


Cantor’s ternary set has a great deal of symmetry and self-similarity. For
example, the mapping $x \to 3x$ is a bijective mapping of $C \cap [0, 1/3]$
onto $C$, and the mapping $s_j$ defined by


s;(x) =2+2/3) for0 <2 < 1-2/3, 
= g42/9 =1 for 1= 2/3? <@'< 1, 


is a bijective mapping of C onto itself. 
There are many constructions similar to the construction of Cantor’s ternary
set. Suppose that $\epsilon = (\epsilon_j)_{j=1}^\infty$ is a sequence of
positive numbers with $\sum_{j=1}^\infty \epsilon_j = s \le 1$. Let
$s_n = \sum_{j=1}^n \epsilon_j$. We construct a sequence
$(C_n^{(\epsilon)})_{n=0}^\infty$ of closed sets recursively;
$C_0^{(\epsilon)} = [0,1]$, and if $n \in \mathbf{N}$ then $C_n^{(\epsilon)}$
consists of $2^n$ disjoint closed subintervals of $[0,1]$, each of length
$(1 - s_n)/2^n$. We construct $C_{n+1}^{(\epsilon)}$ by removing an open
interval of length $\epsilon_{n+1}/2^n$ from the middle of each of these
intervals. Then $C_{n+1}^{(\epsilon)}$ consists of $2^{n+1}$ disjoint closed
subintervals of $[0,1]$, each of length $(1 - s_{n+1})/2^{n+1}$. Finally we set
$C^{(\epsilon)} = \bigcap_{n=0}^\infty C_n^{(\epsilon)}$. Then
$C^{(\epsilon)}$ is a perfect subset of $[0,1]$, with empty interior. As we
shall see, $C^{(\epsilon)}$ is of interest when $s < 1$. In such a case, we
call $C^{(\epsilon)}$ a fat Cantor set.
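
The bookkeeping of lengths in this construction can be checked numerically. In the Python sketch below (names ours; the list `eps` is 0-indexed, so `eps[k]` plays the role of $\epsilon_{k+1}$) we build the first few stages and confirm that the total length after $n$ steps is $1 - s_n$. This is an illustration under the stated indexing assumption, not part of the text's argument.

```python
from fractions import Fraction


def fat_cantor_stage(eps, n):
    """Intervals after n steps, removing a middle gap of length eps[k] / 2**k
    from each of the 2**k intervals present at step k (k = 0, ..., n - 1)."""
    intervals = [(Fraction(0), Fraction(1))]
    for k in range(n):
        gap = Fraction(eps[k]) / 2 ** k
        next_intervals = []
        for lo, hi in intervals:
            half = (hi - lo - gap) / 2  # length of each remaining piece
            if half <= 0:
                raise ValueError("the gaps are too large for this construction")
            next_intervals.append((lo, lo + half))
            next_intervals.append((hi - half, hi))
        intervals = next_intervals
    return intervals


if __name__ == "__main__":
    eps = [Fraction(1, 2 ** (j + 2)) for j in range(10)]  # 1/4, 1/8, ...; sum tends to 1/2
    stage = fat_cantor_stage(eps, 3)
    print(len(stage), sum(hi - lo for lo, hi in stage))
```

With $\epsilon_j = (1/3)(2/3)^{j-1}$ the same function reproduces the stages of Cantor's ternary set, for which $s = 1$.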


Exercises


5.5.1 Suppose that $U$ is an open subset of $\mathbf{R}$. Show that
$\overline{U} = U'$.

5.5.2 Let $B = \{1/j + 1/k : j, k \in \mathbf{N}, k > j\}$. What is $B'$? What
is $(B')'$? What is $((B')')'$?

5.5.3 Show that for each $k \in \mathbf{N}$ there exists a strictly increasing
sequence $B_0 \subset B_1 \subset \cdots \subset B_k$ of subsets of
$\mathbf{R}$ such that $B_j' = B_{j-1}$ for $1 \le j \le k$.

5.5.4 Construct a subset $C$ of $\mathbf{R}$ such that, if $C_1 = C'$, and
$C_j = C_{j-1}'$ for all $j \ge 2$ then $(C_j)_{j=1}^\infty$ is a strictly
decreasing sequence of non-empty subsets of $\mathbf{R}$.

5.5.5 Suppose that $0 < \lambda < 9$. If $x \in [0,1)$, let
$a_n(x) = (x_1 + \cdots + x_n)/n$, where $x = 0.x_1 x_2 \ldots$ is the decimal
expansion of $x$ (without recurrent 9s), and let $a_n(1) = 0$. Let
$E_n = \{x \in [0,1] : a_n(x) \le \lambda\}$. Show that $E_n$ is closed. Show
that $E = \bigcap_{n=1}^\infty E_n$ is a perfect subset of $[0,1]$ with an
empty interior.

5.5.6 Let $(C_n)_{n=0}^\infty$ be the sequence of closed sets that appears in
the construction of Cantor’s ternary set $C$. Suppose that $x \in [0,2]$. Show
that for each $n \in \mathbf{N}$ there exist $u_n, v_n \in C_n$ such that
$x = u_n + v_n$. Use the Bolzano-Weierstrass theorem to show that there exist
$u, v \in C$ such that $x = u + v$. Show that if $x \in [-1,1]$ there exist
$y, z \in C$ such that $x = y - z$.

5.5.7 Suppose that $A$ is a non-empty bounded closed subset of $\mathbf{R}$.
Let


$$C(A) = (-\infty, \inf A) \cup \Big(\bigcup_{I \in \mathcal{I}} I\Big) \cup (\sup A, +\infty),$$


where $\mathcal{I}$ is a set of disjoint open intervals contained in
$(\inf A, \sup A)$. Order $\mathcal{I}$ by setting $I < J$ if
$\sup I < \inf J$. $\mathcal{I}$ has the intermediate property if whenever $I$
and $J$ are in $\mathcal{I}$ and $I < J$ then there exists
$K \in \mathcal{I}$ with $I < K < J$.

(a) Show that if $\mathcal{I}$ has the intermediate property, then $A$ is
perfect.

(b) Suppose that $A$ is perfect and that $A^\circ = \emptyset$. Show that
$\mathcal{I}$ has the intermediate property. Show that there is a bijection of
$P(\mathbf{N})$ onto $A$.

5.5.8 Suppose that $A$ is a non-empty perfect subset of $[0,1]$ with empty
interior. Show that there is a bijective mapping of $P(\mathbf{N})$ onto $A$,
using a construction as in Cantor’s theorem. (Hint: After $n$ steps there are
$2^n$ closed intervals whose union contains $A$. For the next step, remove a
largest possible open interval from each.)

Deduce that there is an order-preserving bijection of Cantor’s ternary set $C$
onto $A$.

Deduce that a non-empty perfect subset of $\mathbf{R}$ is uncountable.

5.5.9 Suppose that $C$ is a closed subset of $\mathbf{R}$, with complement
$\bigcup_j I_j$, where the $I_j$ are disjoint open intervals. Show that $C$ is
perfect if and only if $\overline{I_j} \cap \overline{I_k} = \emptyset$ when
$I_j \neq I_k$.

Is it possible to find a sequence of disjoint non-degenerate closed intervals
whose union is $(0,1)$?

5.5.10 If $A$ is a subset of $\mathbf{R}$ then a point $a$ of $\mathbf{R}$ is
a condensation point of $A$ if $N_\epsilon(a) \cap A$ is uncountable, for
every $\epsilon > 0$. Show that if $A$ is uncountable, then the set $C$ of
condensation points of $A$ is closed, and $A \setminus C$ is countable. Show
that $C$ is the set of condensation points of itself.

5.5.11 Show that every point of Cantor’s ternary set $C$ is a condensation
point of $C$.


6 
Continuity 


6.1 Limits and convergence of functions 


So far we have considered the limits of sequences of real numbers. These
sequences are real-valued functions defined on $\mathbf{Z}^+$ or $\mathbf{N}$.
We now consider real-valued functions defined on a non-empty subset $A$ of
$\mathbf{R}$. It is useful to make definitions for a general set $A$, but the
reader should have in mind examples such as an open interval, a closed
interval, the set $\mathbf{Q}$ of rational numbers and the set
$\{1/n : n \in \mathbf{N}\}$.

The notion of limit extends naturally to this setting. Suppose that
$f : A \to \mathbf{R}$ is a function, that $b$ is a limit point of $A$ (which
may or may not be an element of $A$) and that $l \in \mathbf{R}$. We say that
$f(x)$ converges to $l$, or tends to $l$, as $x$ tends to $b$ if whenever
$\epsilon > 0$ there exists $\delta > 0$ (which usually depends on $\epsilon$)
such that $|f(x) - l| < \epsilon$ for those $x \in A$ for which
$0 < |x - b| < \delta$ (that is, for $x \in N^*_\delta(b) \cap A$). That is to
say, as $x$ gets close to $b$, $f(x)$ gets close to $l$. We say that $l$ is
the limit of $f$ as $x$ tends to $b$, write ‘$f(x) \to l$ as $x \to b$’ and
also write $l = \lim_{x \to b} f(x)$. Note that in the case where $b \in A$,
we do not consider the value of $f(b)$, but only the values of $f$ at points
nearby.


Figure 6.1a. Convergence of functions.
We now have the following elementary results, which correspond exactly
to Propositions 3.2.2, 3.2.3 and 3.2.6, together with Theorem 3.2.5. We say
that f is bounded on A if the image set f(A) is a bounded set.


Theorem 6.1.1 Suppose that $f$, $g$ and $h$ are real-valued functions on a
subset $A$ of $\mathbf{R}$ and that $b$ is a limit point of $A$.

(i) If $f(x) \to l$ as $x \to b$ and $f(x) \to m$ as $x \to b$, then $l = m$.

(ii) If $f(x) \to l$ as $x \to b$ then there exists $\delta > 0$ such that $f$
is bounded on $N^*_\delta(b) \cap A$.

(iii) If $f(x) = l$ for all $x \in A$, then $f(x) \to l$ as $x \to b$.

(iv) If $f(x) \to 0$ as $x \to b$, and $g(x)$ is bounded on
$N^*_\delta(b) \cap A$ for some $\delta > 0$, then $f(x)g(x) \to 0$ as
$x \to b$.

(v) If $f(x) \to l$ and $g(x) \to m$ as $x \to b$ then
$f(x) + g(x) \to l + m$ as $x \to b$.

(vi) If $f(x) \to l$ as $x \to b$ and $c \in \mathbf{R}$ then $cf(x) \to cl$
as $x \to b$.

(vii) If $f(x) \to l$ and $g(x) \to m$ as $x \to b$ then $f(x)g(x) \to lm$ as
$x \to b$.

(viii) If $f(x) \neq 0$ for $x \in A$, $l \neq 0$ and $f(x) \to l$ as
$x \to b$ then $1/f(x) \to 1/l$ as $x \to b$.

(ix) If $f(x) \to l$ and $g(x) \to m$ as $x \to b$, and if $f(x) \le g(x)$ for
all $x \in N^*_\delta(b) \cap A$ for some $\delta > 0$, then $l \le m$.

(x) (The sandwich principle) Suppose that $f(x) \le g(x) \le h(x)$ for all
$x \in N^*_\delta(b) \cap A$, for some $\delta > 0$, and that $f(x) \to l$ and
$h(x) \to l$ as $x \to b$. Then $g(x) \to l$ as $x \to b$.


Proof Since the definition of limit is so similar to the limit of a sequence, 
the proofs are simple modifications of the proofs of corresponding results for 
sequences, established in Section 3.1. The details are left as exercises for the 
reader. 


Note that in several cases we have restricted attention to the behaviour of
$f$ in a set $N^*_\delta(b) \cap A$. This is clearly appropriate, since we are
only concerned with the behaviour of $f$ as $x$ approaches $b$.

It is a useful fact that we can characterize convergence in terms of 
convergent sequences. 


Proposition 6.1.2 Suppose that $f$ is a real-valued function on a subset $A$
of $\mathbf{R}$, that $b$ is a limit point of $A$ and that
$l \in \mathbf{R}$. Then $f(x) \to l$ as $x \to b$ if and only if whenever
$(a_n)_{n=0}^\infty$ is a sequence in $A \setminus \{b\}$ which tends to $b$
as $n \to \infty$ then $f(a_n) \to l$ as $n \to \infty$.


Proof Suppose that $f(x) \to l$ as $x \to b$ and that $(a_n)_{n=0}^\infty$ is
a sequence in $A \setminus \{b\}$ which tends to $b$ as $n \to \infty$. Given
$\epsilon > 0$, there exists $\delta > 0$ such that if
$x \in N^*_\delta(b) \cap A$ then $|f(x) - l| < \epsilon$. There then exists
$n_0$ such that $|a_n - b| < \delta$ for $n \ge n_0$. Then
$|f(a_n) - l| < \epsilon$ for $n \ge n_0$, so that $f(a_n) \to l$ as
$n \to \infty$.

Suppose that $f(x)$ does not converge to $l$ as $x \to b$. Then there exists
$\epsilon > 0$ for which we can find no suitable $\delta > 0$. If
$n \in \mathbf{N}$ then $1/n$ is not suitable, and so there exists
$x_n \in N^*_{1/n}(b) \cap A$ with $|f(x_n) - l| \ge \epsilon$. Then
$x_n \to b$ as $n \to \infty$ and $f(x_n)$ does not converge to $l$ as
$n \to \infty$.


We have the following general principle of convergence. 


Theorem 6.1.3 Suppose that $f$ is a real-valued function on a subset $A$ of
$\mathbf{R}$, that $b$ is a limit point of $A$ and that $l \in \mathbf{R}$.
Then the following are equivalent.


(i) There exists $l$ such that $f(x) \to l$ as $x \to b$.
(ii) Whenever $(a_n)_{n=0}^\infty$ is a sequence in $A \setminus \{b\}$ which
tends to $b$ as $n \to \infty$ then $(f(a_n))_{n=0}^\infty$ is a Cauchy
sequence.
(iii) Given $\epsilon > 0$ there exists $\delta > 0$ such that if
$x, y \in N^*_\delta(b) \cap A$ then $|f(x) - f(y)| < \epsilon$.


Proof Suppose that (i) holds, and that $(a_n)_{n=0}^\infty$ is a sequence in
$A \setminus \{b\}$ which tends to $b$ as $n \to \infty$. By Proposition
6.1.2, $f(a_n) \to l$ as $n \to \infty$. Since a convergent sequence is a
Cauchy sequence, (ii) holds.

Suppose that (iii) fails. Then there exists $\epsilon > 0$ for which for each
$n \in \mathbf{N}$ there exist $a_n, a_n' \in N^*_{1/n}(b) \cap A$ with
$|f(a_n) - f(a_n')| \ge \epsilon$. Let $c_{2n-1} = a_n$ and $c_{2n} = a_n'$,
for $n \in \mathbf{N}$. Then $c_n \to b$ as $n \to \infty$, and
$(f(c_n))_{n=1}^\infty$ is not a Cauchy sequence. Thus (ii) fails: (ii) implies
(iii).

Finally suppose that (iii) holds, and that $\epsilon > 0$. There exists
$\delta > 0$ such that if $x, y \in N^*_\delta(b) \cap A$ then
$|f(x) - f(y)| < \epsilon/2$. Suppose that $(a_n)_{n=0}^\infty$ is a sequence
in $A \setminus \{b\}$ which tends to $b$ as $n \to \infty$. Then there exists
$n_0$ such that $a_n \in N^*_\delta(b)$ for $n \ge n_0$. Thus if
$m, n \ge n_0$ then $|f(a_n) - f(a_m)| < \epsilon/2$, and so
$(f(a_n))_{n=0}^\infty$ is a Cauchy sequence. By the general principle of
convergence, there exists $l$ such that $f(a_n) \to l$ as $n \to \infty$, and
if $n \ge n_0$ then $|f(a_n) - l| \le \epsilon/2$. Thus if
$x \in N^*_\delta(b) \cap A$ then
$|f(x) - l| \le |f(x) - f(a_{n_0})| + |f(a_{n_0}) - l| < \epsilon$;
$f(x) \to l$ as $x \to b$. Thus (iii) implies (i).


We now turn to a result which corresponds to Theorem 3.2.4. First we must
introduce the idea of one-sided convergence. Suppose that $f$ is a real-valued
function on $A$ and that $b \in \mathbf{R}$. Let $A_+ = A \cap (b, \infty)$
and let $A_- = A \cap (-\infty, b)$. Suppose that $b$ is a limit point of
$A_+$ (that is, $(b, b + \delta) \cap A$ is non-empty, for each
$\delta > 0$). Then we say that $f(x)$ tends to $l$ as $x \to b$ from the
right if whenever $\epsilon > 0$ there exists $\delta > 0$ such that if
$x \in A \cap (b, b + \delta)$ then $|f(x) - l| < \epsilon$. We then write
$f(x) \to l$ as $x \searrow b$, and denote $l$ by $\lim_{x \searrow b} f(x)$,
or, more briefly, by $f(b+)$. Similarly if $f(x)$ tends to $l$ as $x \to b$
from the left, we denote the limit $l$ by $\lim_{x \nearrow b} f(x)$, or
$f(b-)$. Why do we use this terminology? If we consider the graph of $f$,
drawn in the usual way, the variable $x$ increases from left to right, and the
values that the function $f$ takes increase in an upwards direction. We
therefore use ‘left’ and ‘right’ for the variable $x$, and reserve words such
as ‘upper’ or ‘lower’ for the values of the function.


Theorem 6.1.4 Suppose that $f$ is a real-valued increasing function on $A$
and that $b$ is a limit point of $A_+ = A \cap (b, \infty)$. If $f$ is bounded
below on $A_+$ then $f(x) \to \inf\{f(y) : y \in A_+\}$ as $x \to b$ from the
right.

Similar results hold for ‘convergence from the left’, and for decreasing
functions.


Proof Let $l = \inf\{f(y) : y \in A_+\}$. Suppose that $\epsilon > 0$. Then
$l + \epsilon$ is not a lower bound for $f$ on $A_+$ and so there exists
$a \in A_+$ with $f(a) < l + \epsilon$. Let $\delta = a - b$. If
$x \in A \cap (b, b + \delta) = A \cap (b, a)$ then
$l \le f(x) \le f(a) < l + \epsilon$, so that $f(x) \to l$ as $x \to b$ from
the right.


This theorem is quite as important as Theorem 3.2.4. 


Corollary 6.1.5 If $b$ is a limit point of $A_+$ and $A_-$ then
$f(b-) \le f(b+)$.


Proof For $\sup\{f(x) : x \in A_-\} \le \inf\{f(x) : x \in A_+\}$.


Suppose again that $b$ is a limit point of a subset $A$ of $\mathbf{R}$, and
suppose that $f$ is a real-valued function which is bounded on
$N^*_\delta(b) \cap A$, for some $\delta > 0$. We can then define the upper and
lower limits of $f$ at $b$. For $0 < t < \delta$, let
$M(t) = \sup\{f(x) : x \in N^*_t(b) \cap A\}$. Then $M(t)$ is an increasing
function on $(0, \delta)$ which is bounded below. By Theorem 6.1.4 it follows
that $M(t)$ converges to $M(0+) = \inf\{M(s) : 0 < s < \delta\}$ as
$t \searrow 0$. $M(0+)$ is the upper limit or limes superior of $f$ at $b$,
and is denoted by $\limsup_{x \to b} f(x)$. The lower limit, or limes inferior
$\liminf_{x \to b} f(x)$ is defined in a similar way.

The next theorem corresponds to Theorem 3.5.3.


Theorem 6.1.6 Suppose that $b$ is a limit point of a subset $A$ of
$\mathbf{R}$, and suppose that $f$ is a real-valued function which is bounded
on $N^*_\delta(b) \cap A$, for some $\delta > 0$. Then $f(x) \to l$ as
$x \to b$ if and only if
$\limsup_{x \to b} f(x) = \liminf_{x \to b} f(x) = l$.


Proof Another exercise for the reader. 
As an example, suppose that $x \in (0,1)$. If $0 < 1/(n+1) < x \le 1/n$, set
$f(x) = n(n+1)(x - 1/(n+1))$. Then $\limsup_{x \to 0} f(x) = 1$ and
$\liminf_{x \to 0} f(x) = 0$. The function $f$ does not tend to a limit as
$x \to 0$, but oscillates between the values 0 and 1.
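
The oscillation of this function can be seen by evaluating it exactly at a few points. The Python sketch below is an informal illustration (names ours); it uses the convention $1/(n+1) < x \le 1/n$, so that $f(1/n) = 1$, and also accepts $x = 1$ for convenience.

```python
import math
from fractions import Fraction


def f(x):
    """f(x) = n(n+1)(x - 1/(n+1)), where n is determined by 1/(n+1) < x <= 1/n."""
    x = Fraction(x)
    if not 0 < x <= 1:
        raise ValueError("x must satisfy 0 < x <= 1")
    n = math.floor(1 / x)  # the largest n with x <= 1/n; then x > 1/(n+1)
    return n * (n + 1) * (x - Fraction(1, n + 1))


if __name__ == "__main__":
    tiny = Fraction(1, 10 ** 6)
    for n in (2, 10, 100):
        # f equals 1 at x = 1/n and is close to 0 just to the right of 1/n.
        print(n, f(Fraction(1, n)), float(f(Fraction(1, n) + tiny)))
```

Every punctured neighbourhood of 0 contains points $1/n$, where $f = 1$, and points just to the right of some $1/n$, where $f$ is as small as we please; this is the behaviour recorded by the upper and lower limits.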


Figure 6.1b. $\limsup_{x \to 0} f(x) \neq \liminf_{x \to 0} f(x) = 0$.


We can also consider limits as $x \to +\infty$ or as $x \to -\infty$. Suppose
that $A$ is a subset of $\mathbf{R}$ which is not bounded above, that $f$ is a
real-valued function on $A$ and that $l \in \mathbf{R}$. Then we say that
$f(x) \to l$ as $x \to +\infty$ if whenever $\epsilon > 0$ there exists
$x_0 \in \mathbf{R}$ such that if $x \in A$ and $x \ge x_0$ then
$|f(x) - l| < \epsilon$. Similarly, if there exists $x_0$ such that $f$ is
bounded on $A \cap [x_0, \infty)$, and we define
$M(x) = \sup\{f(a) : a \in A \cap [x, \infty)\}$ for $x \ge x_0$, then $M(x)$
is a decreasing function on $[x_0, \infty)$ which is bounded below; we define
$\limsup_{x \to +\infty} f(x)$ as $\inf\{M(x) : x \in [x_0, \infty)\}$. The
lower limit is defined similarly. Limits as $x \to -\infty$ are defined in the
same way. The reader should verify that all the results of this section, with
appropriate modifications, extend without difficulty to these situations.


Exercises 


6.1.1 Show that $\lim_{x \to 0} \sqrt{1 + x + x^2} = 1$.
6.1.2 Show that(V1+2+4+ 22 — 1)/(V1+a2— /1— 2) tends to a limit as 
x — 0, and evaluate the limit. 


6.2 Orders of magnitude 


This section is a digression; it introduces some notation that is frequently 
used, though we shall use it sparingly. 

Suppose that f is a function defined on a subset A of R, and that b is a
limit point of A. Frequently, the principal point of interest is the behaviour of
f near b, rather than its actual value. The O (big O) and o (little o) notation
is used to describe the magnitude of f near b in terms of another, usually
simpler, function g.

Suppose that g is another real-valued function on A. We write


f(x) = O(g(x)) as x → b


if there exist δ > 0 and M ∈ R such that |f(x)| ≤ M|g(x)| for x ∈ N_δ(b) ∩ A.
Suppose that there exists δ > 0 such that g(x) ≠ 0 for x ∈ N_δ(b) ∩ A.
Then we write


f(x) = o(g(x)) as x → b


if f(x)/g(x) → 0 as x → b and write


f(x) ~ g(x) as x → b


if f(x)/g(x) → 1 as x → b.

We use the same notation when x → ∞; thus f(x) = o(g(x)) as x → ∞ if
f(x)/g(x) → 0 as x → ∞. As a particular example, if (a_n)_{n=1}^∞ and (b_n)_{n=1}^∞
are real-valued sequences then a_n ~ b_n if a_n/b_n → 1 as n → ∞.

For example, suppose that p(x) = a_0 + a_1 x + ··· + a_n x^n is a polynomial
function of degree n on R (with a_n ≠ 0). Then p(x) ~ a_n x^n as x → ∞.


This notation arose in analytic number theory, where a complicated expres- 
sion f is approximated by a simpler function g, and the interest lies in 
estimating the magnitude of the difference. Thus it might be shown that 
f(x) − g(x) = O(h(x)) as x → ∞; in this case we write f(x) = g(x) + O(h(x)).
For example, if p is the polynomial above, then


p(x) = a_n x^n + O(x^{n−1}) = a_n x^n + o(x^n) as x → ∞,
and p(x) = a_0 + a_1 x + O(x²) = a_0 + o(1) as x → 0.


Although this notation is expressive, its use requires care; in practice, it 
is frequently advisable to expand any statement involving it into a more 
standard form. 
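
Expanding an order-of-magnitude statement into a standard form can be illustrated numerically. The sketch below is an illustration only: the cubic p is an arbitrary choice made here, not taken from the text, and integer arithmetic is used so that the remainder p(x) − a_n x^n is computed exactly.

```python
def p(x: int) -> int:
    """Illustrative cubic p(x) = 7 + 2x - 5x^2 + 3x^3 (coefficients chosen arbitrarily)."""
    return 7 + 2 * x - 5 * x**2 + 3 * x**3


for k in range(1, 7):
    x = 10**k
    remainder = p(x) - 3 * x**3          # p(x) - a_n x^n, computed exactly
    print(
        x,
        p(x) / (3 * x**3),               # tends to 1:    p(x) ~ a_n x^n
        remainder / x**2,                # stays bounded: remainder = O(x^(n-1))
        remainder / x**3,                # tends to 0:    remainder = o(x^n)
    )
```

The three printed ratios correspond to the three notations: the first to ~, the second to O and the third to o.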


6.3 Continuity


We now introduce the fundamental concept of continuity. Suppose that f
is a real-valued function defined on a subset A of R, and that a ∈ A. f is
continuous at a if whenever ε > 0 there exists δ > 0 (which usually depends
on ε) such that |f(x) − f(a)| < ε for those x ∈ A for which |x − a| < δ (that
is, for x ∈ N_δ(a) ∩ A). That is to say, as x gets close to a, f(x) gets close to
f(a). If f is not continuous at a, we say that f has a discontinuity at a.

Compare this definition with the definition of convergence. First, a must
be an element of A, so that f(a) is defined. Secondly, a need not be a limit
point of A. If it is, then f is continuous at a if and only if f(x) → f(a) as
x → a. If a is not a limit point, then it is an isolated point of A. In this case,
there exists δ > 0 such that N_δ(a) ∩ A = {a}, so that if x ∈ N_δ(a) ∩ A then
f(x) = f(a), and f is continuous at a; functions are always continuous at
isolated points.

We now have the following elementary results, which correspond exactly 
to Theorem 6.1.1. 


Theorem 6.3.1 Suppose that f, g and h are real-valued functions on a
subset A of R and that a ∈ A.


(i) If f is continuous at a then there exists δ > 0 such that f is bounded
on N_δ(a) ∩ A.
(ii) If f(x) = l for all x ∈ A, then f is continuous at a.
(iii) If f(a) = 0, f is continuous at a, and g(x) is bounded on N_δ(a) ∩ A
for some δ > 0, then fg is continuous at a.
(iv) If f and g are continuous at a then f + g is continuous at a.
(v) If f and g are continuous at a then fg is continuous at a.
(vi) If f(x) ≠ 0 for x ∈ A, and if f is continuous at a then 1/f is
continuous at a.
(vii) (The sandwich principle) Suppose that f(x) ≤ g(x) ≤ h(x) for all
x ∈ N_δ(a) ∩ A, for some δ > 0, that f(a) = g(a) = h(a) and that f
and h are continuous at a. Then g is continuous at a.


Proof These results follow directly from Theorem 6.1.1, and the remarks
above.


Similarly, we have the following consequence of Proposition 6.1.2. 


Proposition 6.3.2 Suppose that f is a real-valued function on a subset
A of R and that a ∈ A. Then f is continuous at a if and only if whenever
a_n → a as n → ∞ then f(a_n) → f(a) as n → ∞.


Continuity behaves well under composition.


Theorem 6.3.3 Suppose that f is a real-valued function on a subset A of
R and that g is a real-valued function on a subset B of R which contains
f(A). If f is continuous at a ∈ A and g is continuous at f(a), then g ∘ f is
continuous at a.


Proof Suppose that ε > 0. Then there exists η > 0 such that if b ∈ B and
|b − f(a)| < η then |g(b) − g(f(a))| < ε. Similarly there exists δ > 0 such
that if a′ ∈ A and |a′ − a| < δ then |f(a′) − f(a)| < η. Thus if a′ ∈ A and
|a′ − a| < δ then |g(f(a′)) − g(f(a))| < ε.


The proof is trivial: the theoretical importance and practical usefulness 
are enormous. 

Continuity is a local phenomenon. Nevertheless, there are many important 
cases where f is continuous at every point of A. In this case we say that f is 
continuous on A, or more simply, that f is continuous. Continuity on A can 
be characterized in terms of open sets, and in terms of closed sets. 


Proposition 6.3.4 Suppose that f is a real-valued function on a subset A
of R. The following are equivalent:
(i) f is continuous on A;
(ii) if U is an open subset of R then f^{-1}(U) is a relatively open subset
of A;
(iii) for each c ∈ R the sets U_c = {x ∈ A : f(x) > c} and L_c = {x ∈ A :
f(x) < c} are relatively open in A;
(iv) if F is a closed subset of R then f^{-1}(F) is a relatively closed subset of
A;
(v) for each c ∈ R the sets F_c = {x ∈ A : f(x) ≥ c} and G_c = {x ∈ A :
f(x) ≤ c} are relatively closed in A.


Proof Suppose that f is continuous on A, that U is an open subset of R and
that x ∈ f^{-1}(U). Since U is open, there exists ε > 0 such that N_ε(f(x)) ⊆ U.
Since f is continuous at x, there exists δ > 0 such that if y ∈ N_δ(x) ∩ A
then |f(y) − f(x)| < ε. Thus N_δ(x) ∩ A ⊆ f^{-1}(U), and so f^{-1}(U) is a
relatively open subset of A. Thus (i) implies (ii). Since U_c = f^{-1}((c, ∞)) and
L_c = f^{-1}((−∞, c)), and (c, ∞) and (−∞, c) are open, (ii) implies (iii).
Suppose that (iii) holds. Suppose that x ∈ A and that ε > 0. Then the
sets U_{f(x)−ε} and L_{f(x)+ε} are relatively open in A, and x is in each of them.
Thus U_{f(x)−ε} ∩ L_{f(x)+ε} is relatively open, and there exists δ > 0 such that
N_δ(x) ∩ A ⊆ U_{f(x)−ε} ∩ L_{f(x)+ε}; thus if y ∈ N_δ(x) ∩ A then f(y) > f(x) − ε
and f(y) < f(x) + ε, so that |f(y) − f(x)| < ε. Thus f is continuous at x.
Since this holds for all x ∈ A, (iii) implies (i).

Finally, the equivalences of (ii) and (iv), and of (iii) and (v), follow by 
considering complements. 


It is important that these conditions involve inverse images of open and 
closed sets. Here is a simple example to show that similar results need not 
hold for direct images. Let f(x) = 1/(1 + x²). Then f is continuous on R, R
is both open and closed, and f(R) = (0, 1], which is neither open nor closed. 
The continuous image of an open set need not be open, and the continuous 
image of a closed set need not be closed. 

We now consider some simple examples of continuous real-valued func- 
tions, and of discontinuities, which will enable us to introduce some more 
ideas. 


1. Take A = R, and set i(x) = x. Then i is continuous on R; if |x − a| < ε
then |i(x) − i(a)| < ε, so that we can take δ = ε, for each a ∈ R. Combining
this with the results of Theorem 6.3.1, we see that all polynomial functions
on R are continuous.

2. The exponential function is continuous on R. First, note that if |h| < 1
then


|e^h − 1| = |h + h²/2! + h³/3! + ···| ≤ |h|(1 + 1/2 + 1/2² + ···) = 2|h|.


Suppose that a ∈ R and ε > 0. Let δ = min(1, ε/2e^a). If |x − a| < δ then


|e^x − e^a| = |e^a e^{x−a} − e^a| = e^a |e^{x−a} − 1| ≤ 2|x − a|e^a < 2δe^a ≤ ε.


3. Take A = R, and set f(x) = x if x ≠ 0 and set f(0) = 1. Then f is
continuous at every point of R except 0. The discontinuity at 0 is the
simplest sort of discontinuity; if we change the value at 0 to 0, we remove
the discontinuity. More generally, a real-valued function f on A has a
removable discontinuity at a if f(x) → l as x → a, and l ≠ f(a). If we
redefine f(a) as l, then the discontinuity disappears.

4. Suppose that f is a real-valued function on a subset A of R, and that a ∈ A.
We say that f is continuous on the right at a if whenever ε > 0 there exists
δ > 0 (which usually depends on ε) such that |f(x) − f(a)| < ε for those
x ∈ A with a < x < a + δ. Continuity on the left is defined in a similar
way. f is continuous on the right if and only if either f(a+) = lim_{x↘a} f(x)
exists and is equal to f(a), or there exists δ > 0 such that (a, a + δ) ∩ A = ∅.
We say that f has a jump discontinuity at a if one of the following cases
holds:

(i) f(a−) and f(a+) both exist and are different, and f(a) ∈
[f(a−), f(a+)]; in this case we have a jump of (positive or negative)
size f(a+) − f(a−);

(ii) f(a−) exists and is different from f(a), and f is continuous on the
right at a; in this case we have a jump of (positive or negative) size
f(a) − f(a−);

(iii) f(a+) exists and is different from f(a), and f is continuous on the
left at a; in this case we have a jump of (positive or negative) size
f(a+) − f(a).

[We give this cumbersome definition to allow for the possibility that
A ∩ (a, a + δ) or A ∩ (a − δ, a) may be empty for some δ > 0.]


Figure 6.3. A jump discontinuity.


Theorem 6.3.5 The only discontinuities of a monotonic function are 
jump discontinuities, and the set of discontinuities is countable. 


Proof The first statement follows from Theorem 6.1.4. For the second,
let D be the set of discontinuities of f. If d ∈ D, let i(d) = (f(d−), f(d+))
[in case (i) above] or (f(d−), f(d)) [in case (ii)], or (f(d), f(d+)) [in case
(iii)]. Then the open intervals {i(d) : d ∈ D} are disjoint, and their union
is open, and so D is countable, by Theorem 5.3.3.


5. Suppose that A is a subset of R. Let I_A be the indicator function of A:
I_A(x) = 1 if x ∈ A and I_A(x) = 0 if x ∉ A. If x ∈ A° then there exists
δ > 0 such that N_δ(x) ⊆ A, and then I_A(y) = I_A(x) = 1 for y ∈ N_δ(x).
Thus I_A is continuous at each point of A°. Similarly, I_A is continuous at
each point of (C(A))° = C(Ā). What happens if x ∈ ∂A? If x ∈ ∂A and
δ > 0 then there exist y ∈ N_δ(x) ∩ A and z ∈ N_δ(x) ∩ C(A), so that
I_A(y) = 1 and I_A(z) = 0. Thus I_A is not continuous at x.

For example, the indicator function of Cantor's ternary set C is discontinuous
at points of C, and continuous at points of the complement of C. The
indicator function of the rationals has no points of continuity, since ∂Q =
R.

6. Let f be the saw-tooth function


f(x) = {x}      for 2k ≤ x < 2k + 1,
f(x) = 1 − {x}  for 2k + 1 ≤ x < 2k + 2,


for k ∈ Z. Let g(x) = f(1/x) for x ≠ 0, and let g(0) = 0. Then g has a
discontinuity at 0: g(x) oscillates in value between 0 and 1 as x → 0.
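
Two of the examples above lend themselves to a quick check by computation. The sketch below is illustrative only. For example 2 it takes δ = min(1, ε/2e^a); the cap at 1 is an assumption made here so that the estimate for |h| < 1 applies. For example 6 it evaluates g exactly at the points 1/(2k) and 1/(2k + 1), using rational arithmetic. The function names are hypothetical.

```python
from fractions import Fraction
import math


def delta_for_exp(a: float, eps: float) -> float:
    """A delta that works for exp at a, following example 2.

    The cap at 1 is an assumption added here so that |x - a| < 1, which the
    estimate |e^h - 1| <= 2|h| requires.
    """
    return min(1.0, eps / (2 * math.exp(a)))


def sawtooth(x: Fraction) -> Fraction:
    """The saw-tooth of example 6: {x} on [2k, 2k+1), 1 - {x} on [2k+1, 2k+2)."""
    whole = math.floor(x)
    frac = x - whole
    return frac if whole % 2 == 0 else 1 - frac


def g(x: Fraction) -> Fraction:
    """g(x) = sawtooth(1/x) for x != 0, and g(0) = 0."""
    return sawtooth(1 / x) if x != 0 else Fraction(0)


# Example 2: every sampled x within delta of a satisfies |e^x - e^a| < eps.
a, eps = 2.0, 0.1
delta = delta_for_exp(a, eps)
worst = max(abs(math.exp(a + delta * t / 100) - math.exp(a)) for t in range(-99, 100))
print(delta, worst, worst < eps)

# Example 6: g is 0 at x = 1/(2k) and 1 at x = 1/(2k+1), however small x is.
for k in (1, 10, 1000):
    print(g(Fraction(1, 2 * k)), g(Fraction(1, 2 * k + 1)))
```

The sampling in the first check is of course no substitute for the inequality proved in example 2; it only confirms that the chosen δ behaves as expected at the sampled points.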


These examples by no means exhaust the ways in which a real-valued 
function can be discontinuous. 

We have seen that the continuous image of a closed set need not be closed. 
The situation is different for bounded closed sets. We now use the Bolzano— 
Weierstrass theorem to obtain some results of fundamental importance. 


Theorem 6.3.6 Suppose that f is a continuous real-valued function on a
non-empty bounded closed subset A of R. The image f(A) = {f(x) : x ∈ A}
is a bounded and closed subset of R. In particular, f attains its bounds: there
exist y, z ∈ A such that f(y) = sup{f(x) : x ∈ A} and f(z) = inf{f(x) :
x ∈ A}.


Proof First, suppose, if possible, that f is not bounded. Then for each n ∈ N
there exists a_n ∈ A with |f(a_n)| ≥ n. By the Bolzano–Weierstrass theorem
there exists a subsequence (a_{n_k})_{k=1}^∞ which converges to an element a ∈ A as
k → ∞. Since f is continuous, f(a_{n_k}) → f(a) as k → ∞ (Proposition 6.3.2),
and so (f(a_{n_k}))_{k=1}^∞ is bounded, giving a contradiction.

Secondly, suppose that b belongs to the closure of f(A). Then there exists a
sequence (a_n)_{n=1}^∞ in A such that f(a_n) → b as n → ∞. By the
Bolzano–Weierstrass theorem there exists a subsequence (a_{n_k})_{k=1}^∞ which
converges to an element a ∈ A as k → ∞. Since f is continuous, f(a_{n_k}) → f(a)
as k → ∞ (Proposition 6.3.2). But f(a_{n_k}) → b as k → ∞, and so b = f(a) ∈ f(A).
Thus f(A) is equal to its closure, and f(A) is closed.


Suppose that f is a real-valued function defined on an interval I and that
a is an interior point of I. f has a local maximum at a if there exists δ > 0
such that (a − δ, a + δ) ⊆ I and f(x) ≤ f(a) for all x ∈ (a − δ, a + δ). A local
minimum is defined similarly.




Corollary 6.3.7 Suppose that f is a continuous real-valued function on
an interval I which has no local maximum or local minimum. Then f is a
monotonic function on I.


Proof Suppose that f is not monotonic. Then there exist a < d < b in I
such that either f(a) < f(d) > f(b) or f(a) > f(d) < f(b). Consider the
restriction of f to [a, b]. In the former case, f attains its supremum at a point
c of [a, b]. Since f(c) ≥ f(d) > max(f(a), f(b)), c is an interior point of [a, b],
and c is a local maximum of f. In the second case, f has a local minimum in
[a, b]; the proof is exactly similar.


Theorem 6.3.8 Suppose that f is an injective continuous real-valued
function on a non-empty bounded closed subset A of R. Then the inverse
mapping f^{-1} : f(A) → A is continuous.


Proof Let h = f^{-1}. If F is a closed subset of R, then h^{-1}(F) = f(F ∩ A),
which is closed in R, by Theorem 6.3.6, and is therefore closed in f(A). Thus
h is continuous, by Proposition 6.3.4.


If f is a continuous real-valued function on a set A, and ε > 0, then for
each x ∈ A there exists a δ > 0 such that f(N_δ(x) ∩ A) ⊆ N_ε(f(x)). In
general, the value of δ depends on x. To take a very easy example, consider
the continuous real-valued function f(x) = x² on R. Then if ε > 0 and x > 0
then


(x + ε/2x)² = x² + ε + ε²/4x² > x² + ε,


and so δ can be no larger than ε/2x. Thus it is not possible to find a single
δ > 0 that will work for all x. There are however important cases where for
each ε > 0 a single δ will do. This merits a definition. Suppose that f is a
real-valued function defined on a subset A of R. f is uniformly continuous
on A if whenever ε > 0 there exists δ > 0 (which usually depends on ε) such
that if x, y ∈ A and |x − y| < δ then |f(x) − f(y)| < ε.
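
The failure of uniform continuity for x² on R can be made concrete: for any proposed pair ε, δ one can exhibit two points closer than δ whose squares differ by more than ε. The following sketch does this with exact rational arithmetic; the function name `witness` and the particular choice x = ε/δ, y = x + δ/2 are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction


def witness(eps: Fraction, delta: Fraction):
    """Return (x, y) with |x - y| < delta but |x^2 - y^2| > eps."""
    x = eps / delta          # far enough out that the function is steep
    y = x + delta / 2        # within delta of x
    return x, y


eps = Fraction(1)
for delta in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**6)):
    x, y = witness(eps, delta)
    print(float(x), float(abs(x - y)), float(abs(x * x - y * y)))
```

Here y² − x² = (δ/2)(2x + δ/2) = ε + δ²/4, which exceeds ε whatever the value of δ.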


Theorem 6.3.9 Suppose that f is a continuous real-valued function on a 
non-empty bounded closed subset A of R. Then f is uniformly continuous 
on A. 


Proof Suppose not. Then there exists ε > 0 for which we can find no suitable
δ > 0. Thus for each n ∈ N there exist elements a_n and b_n in A with
|a_n − b_n| < 1/n and |f(a_n) − f(b_n)| ≥ ε. By the Bolzano–Weierstrass theorem
there exists a subsequence (a_{n_k})_{k=1}^∞ which converges to an element a ∈ A as
k → ∞. Since a_{n_k} − b_{n_k} → 0 as k → ∞, b_{n_k} → a as well. Since f is
continuous at a, f(a_{n_k}) → f(a) and f(b_{n_k}) → f(a) as k → ∞, so that
f(a_{n_k}) − f(b_{n_k}) → 0 as k → ∞. As |f(a_{n_k}) − f(b_{n_k})| ≥ ε for all k ∈ N, we
have a contradiction.


We have seen that it is not always possible to exchange limiting procedures 
when we consider a double sequence. Similar phenomena occur when we 
consider a sequence of functions of a real variable, or a function of two real 
variables. For example, let f(x, y) = e^{−x/y} for x > 0, y > 0. Then


lim_{x→∞} ( lim_{y→∞} f(x, y) ) = lim_{x→∞} 1 = 1,

while

lim_{y→∞} ( lim_{x→∞} f(x, y) ) = lim_{y→∞} 0 = 0.


Similarly, let f_n(x) = x^n, for x ∈ [0, 1] and n ∈ N. Then each function f_n
is continuous on [0, 1]. Let f(x) = 0 in 0 ≤ x < 1 and let f(1) = 1. Then
f_n(x) → f(x) for each x ∈ [0, 1], and f is not continuous at 1.

There is however one easy and important case, where limits are taken of 
increasing functions or sequences, and sums are taken of positive elements. 
We shall prove just one case, which we shall need later. 


Theorem 6.3.10 Suppose that (f_n)_{n=1}^∞ is a sequence of non-negative
increasing functions on an interval [a, b], each of which is continuous on
the left at b. Then


Σ_{n=1}^∞ f_n(b) = lim_{x↗b} ( Σ_{n=1}^∞ f_n(x) ).


(Here the sums and limit can be finite or infinite.)


Proof If Σ_{n=1}^∞ f_n(c) = ∞ for some c ∈ [a, b), then Σ_{n=1}^∞ f_n(x) = ∞ for
x ∈ [c, b], and the result holds. Otherwise, the mapping x → Σ_{n=1}^∞ f_n(x) is
increasing, and so


sup_{a≤x<b} Σ_{n=1}^∞ f_n(x) = lim_{x↗b} Σ_{n=1}^∞ f_n(x).


Hence


Σ_{n=1}^∞ f_n(b) = sup_{m∈N} Σ_{n=1}^m f_n(b)

= sup_{m∈N} ( sup_{a≤x<b} Σ_{n=1}^m f_n(x) )

= sup_{a≤x<b} ( sup_{m∈N} Σ_{n=1}^m f_n(x) )

= sup_{a≤x<b} Σ_{n=1}^∞ f_n(x) = lim_{x↗b} Σ_{n=1}^∞ f_n(x).

Exercises


6.3.1 A real-valued function f on R is periodic if there exists a non-zero
number t such that f(x + t) = f(x) for all x ∈ R. Suppose that f is
periodic. Show that


{t ∈ R : f(x + t) = f(x) for all x ∈ R}


is a subgroup of R. Show that if f is continuous and not constant, then
it is a proper closed subgroup of R, and that there is a least positive
t such that f(x + t) = f(x) for all x ∈ R; t is the period of f.

6.3.2 Show that the exponential function is strictly increasing on R, that
it is not uniformly continuous on R, but that its restriction to any
semi-infinite interval (−∞, A] is uniformly continuous.

6.3.3 Define a real-valued function f on (0, 1) as follows. If r is rational
and r = p/q in lowest terms then f(r) = 1/q; if x is irrational, then
f(x) = 0. Show that f is continuous at every irrational point of (0, 1),
and that f is discontinuous at every rational point of (0, 1).

6.3.4 Suppose that f and g are continuous real-valued functions on A and
that h(x) = max(f(x), g(x)), for x ∈ A. Show that h is continuous.
Give an example of a sequence (f_n)_{n=0}^∞ of non-negative continuous
real-valued functions on [0, 1] for which inf_{n∈N} f_n is not continuous.

6.3.5 Suppose that A is a non-empty subset of R. Let d_A(x) = inf{|x − a| :
a ∈ A}, for x ∈ R.

(a) Show that |d4(x) — da(y)| < 
(b) Show that {c € R: da(x) = 
(c) Suppose that A is closed and that B is compact. Show that
there exist a ∈ A and b ∈ B such that |a − b| = inf{|x − y| :
x ∈ A, y ∈ B}.

(d) Suppose further that A and B are disjoint. Show that there exist
disjoint open sets U and V such that A ⊆ U and B ⊆ V.

(e) Suppose that A and C are disjoint closed subsets of R. Show that
there exist disjoint open sets U and V such that A ⊆ U and
C ⊆ V.

6.3.6 Let K be a closed subset of [0, 1] containing 0 and 1, and let f be
a continuous real-valued function on K. Extend f to [0, 1] by linear
interpolation: if x ∈ [0, 1] \ K let l = sup{k ∈ K : k < x}, let
r = inf{k ∈ K : k > x}, and let


f(x) = ((r − x)/(r − l)) f(l) + ((x − l)/(r − l)) f(r).


Show that the extended real-valued function f is continuous on [0, 1].

6.3.7 Give an example of a continuous injective real-valued function f on a
closed subset A of R for which the inverse function is not continuous.

6.3.8 Show that if f is a uniformly continuous real-valued function on a
subset A of R then f extends to a uniformly continuous function g
on the closure Ā of A, and that the extension is unique. Give an example
to show that the corresponding result for continuous real-valued functions
is false.

6.3.9 At the jth stage in the construction of Cantor's ternary set C, we
remove 2^{j−1} intervals, each of length 1/3^j. List these intervals from
left to right as J_{1,j}, ..., J_{2^{j−1},j}; that is, sup(J_{i,j}) < inf(J_{i+1,j}) for
1 ≤ i < 2^{j−1}. Define a function f on [0, 1] \ C by setting f(x) =
(2i − 1)/2^j for x ∈ J_{i,j}. Verify that f is an increasing function on
[0, 1] \ C. Set f(1) = 1, and if x ∈ C and x ≠ 1, set f(x) = inf{f(y) :
y > x, y ∈ [0, 1] \ C}. Show that f is a continuous increasing function
on [0, 1]. This is the Cantor–Lebesgue function.

6.3.10 Suppose that f is a real-valued function on R which satisfies f(x + y) =
f(x) + f(y) for all x, y ∈ R, and which is continuous at 0. Show that
there exists λ ∈ R such that f(x) = λx for all x ∈ R.

6.3.11 Suppose that f is a continuous real-valued function on [0, 1] with
f(0) = f(1) = 0. Suppose that for each 0 < x < 1 there exists
h > 0, with 0 ≤ x − h < x < x + h ≤ 1, such that


f(x) = (f(x − h) + f(x + h))/2.


Show that f(x) = 0 for all x ∈ [0, 1].

6.3.12 If q ∈ N, let Q_q be the set of rationals p/q in [0, 1], where p and q
are coprime. Suppose that n ∈ N and that x ∈ [0, 1]. If there exists
p/q ∈ Q_q such that |x − p/q| < 1/2n², let f_n(x) = (1 − 2n²|x − p/q|)/q;
otherwise let f_n(x) = 0. Show that f_n is a well-defined function on
[0, 1] and that f_n is continuous. Show that for each x ∈ [0, 1], f_n(x)
converges to f(x) as n → ∞, where f is the function defined in
Exercise 6.3.3. The point-wise limit of continuous functions can have a
dense set of discontinuities.


6.4 The intermediate value theorem 


We now consider a continuous real-valued function defined on an interval I.
Suppose that f is continuous on I, that a, b ∈ I, and that f(a) is negative
and f(b) is positive. Then intuition suggests that f(c) = 0 for some point c
in the interval [a, b]. This is indeed so, but we must prove it. The result is a
consequence of the connectedness of the interval [a, b].


Theorem 6.4.1 Suppose that f is a continuous function on an interval
I, that a, b are points of I with a < b, and that f(a) < v < f(b). Then there
exists a < c < b such that f(c) = v.


Proof We give two proofs. The first uses the connectedness of [a, b]. Let L =
{x ∈ [a, b] : f(x) < v} and let G = {x ∈ [a, b] : f(x) > v}. Then L and G
are disjoint non-empty relatively open subsets of [a, b]. Since [a, b] is connected,
it follows that [a, b] ≠ L ∪ G; if c ∈ [a, b] \ (L ∪ G) then f(c) = v.

For the second proof, we use repeated dissection, as in the second proof
of the Bolzano–Weierstrass theorem. Set a_0 = a and b_0 = b. Let d_0 =
(a_0 + b_0)/2. If f(d_0) ≥ v, we set a_1 = a_0 and b_1 = d_0. Otherwise,
f(d_0) < v, and we set a_1 = d_0 and b_1 = b_0. Thus b_1 − a_1 = (b_0 − a_0)/2,
and f(a_1) ≤ v ≤ f(b_1). We now iterate this procedure recursively. At the
jth step, we obtain a closed interval [a_j, b_j] contained in [a_{j−1}, b_{j−1}] with
b_j − a_j = (b_0 − a_0)/2^j, and with f(a_j) ≤ v ≤ f(b_j). Then the sequence (a_j)_{j=0}^∞
is increasing, the sequence (b_j)_{j=0}^∞ is decreasing, and both converge to a
common limit c. Then f(c) = lim_{j→∞} f(a_j) ≤ v and f(c) = lim_{j→∞} f(b_j) ≥ v,
so that f(c) = v.
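
The repeated dissection in the second proof is, in effect, the bisection method. The following is a minimal sketch of it in floating-point arithmetic; the function name `bisect_ivt`, the tolerance and the cap on the number of steps are assumptions made for this illustration, and the result is only an approximation to the point c of the theorem.

```python
from typing import Callable


def bisect_ivt(f: Callable[[float], float], a: float, b: float, v: float,
               tol: float = 1e-12, max_steps: int = 200) -> float:
    """Approximate a point c in [a, b] with f(c) = v.

    Assumes that f is continuous on [a, b] and that f(a) < v < f(b).
    """
    if not (a < b and f(a) < v < f(b)):
        raise ValueError("need a < b and f(a) < v < f(b)")
    for _ in range(max_steps):
        if b - a <= tol:
            break
        d = (a + b) / 2
        if f(d) >= v:
            b = d        # keeps f(b_j) >= v
        else:
            a = d        # keeps f(a_j) < v
    return (a + b) / 2


c = bisect_ivt(lambda x: x**3 + x, 0.0, 2.0, 1.0)
print(c, c**3 + c)
```

Each pass through the loop halves the interval while preserving f(a_j) ≤ v ≤ f(b_j), exactly as in the proof; the proof obtains c as the common limit, whereas the code stops once the interval is shorter than the tolerance.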


Of course, a similar result holds if f(a) > f(b).


Corollary 6.4.2 If f is a continuous function on an interval I then f(I)
is an interval.


Corollary 6.4.3 If f is a continuous strictly monotonic function on an 
open interval I then f(I) is an open interval. 


Proof If I is open and x ∈ I, there exist a, b ∈ I with a < x < b, so that
f(x) lies in the open interval with end-points f(a) and f(b), which is
contained in f(I); thus f(I) is open.


Proposition 6.4.4 If f is a continuous function on an interval I then f 
is injective if and only if f is strictly monotonic. 


Proof If f is strictly monotonic, then certainly f is injective. Suppose that 
f is not strictly monotonic, and suppose for example that a < b < c while 
f(a) < f(c) < f(b). Then there exists d € [a,b] such that f(d) = f(c), 
contradicting the fact that f is injective. Other possibilities are dealt with in 
the same way. 


Proposition 6.4.5 If f is a strictly monotonic function on an interval I
then f^{-1} : f(I) → I is continuous.


Proof Suppose without loss of generality that f is strictly increasing.
Suppose that b ∈ f(I) and that ε > 0. Suppose that a = f^{-1}(b) is an interior
point of I. There exist c, d ∈ I with a − ε < c < a < d < a + ε. Then
f(c) < f(a) = b < f(d); let δ = min(b − f(c), f(d) − b). If y ∈ f(I) and
|y − b| < δ, then f(c) < y < f(d), so that c < f^{-1}(y) < d, and
|f^{-1}(y) − f^{-1}(b)| < ε. The case where a is an end-point of I is left to
the reader.


Note that in this last proposition we do not require f to be continuous. 
We can now establish the existence of nth roots of positive numbers 
without the need for any subsidiary calculations. 


Corollary 6.4.6 If a > 0 and n ∈ N then there exists a unique y > 0 such
that y^n = a. Let y = a^{1/n}. Then the mapping a → a^{1/n} : (0, ∞) → (0, ∞) is
continuous.


Proof The function f(x) = x^n is a strictly increasing continuous function
on (0, ∞), so that f((0, ∞)) is an interval. Since f(x) → 0 as x → 0 and
f(x) → ∞ as x → ∞, f((0, ∞)) = (0, ∞). Thus f^{-1} is a continuous bijection
of (0, ∞) onto (0, ∞). If a ∈ (0, ∞) then y = f^{-1}(a) is the unique positive
nth root of a.


We also have the following. 


Proposition 6.4.7 Suppose that p is a real polynomial of odd degree n. 
Then there exists x € R with p(x) =0. 


Proof Without loss of generality, we can suppose that p is monic, so that p =
x^n + a_{n−1}x^{n−1} + ··· + a_0. We shall show that p takes both positive and negative
values. If x ≠ 0 then p(x) = x^n(1 + q(x)), where q(x) = a_{n−1}/x + ··· + a_0/x^n.
Since a_j/x^{n−j} → 0 as x → ∞ and as x → −∞, for 0 ≤ j ≤ n − 1, there exists
R > 0 such that |q(x)| < 1/2 for |x| ≥ R. Then 1 + q(x) > 1/2 for |x| ≥ R, so
that p(−R) < −R^n/2 < 0 and p(R) > R^n/2 > 0. By the intermediate value
theorem there exists x ∈ [−R, R] for which p(x) = 0.
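
The proof is constructive enough to be turned into a small computation. In the sketch below the explicit choice R = 1 + 2(|a_0| + ··· + |a_{n−1}|) is an assumption made here (the proof only asserts that some R exists): for |x| ≥ R ≥ 1 we have |q(x)| ≤ (|a_0| + ··· + |a_{n−1}|)/|x| < 1/2. The function name and the use of bisection to locate the root are likewise illustrative.

```python
def odd_poly_root(lower: list[float], tol: float = 1e-12, max_steps: int = 200) -> float:
    """Approximate a real root of x^n + lower[n-1] x^(n-1) + ... + lower[0], n odd."""
    n = len(lower)
    if n % 2 == 0:
        raise ValueError("degree must be odd")

    def p(x: float) -> float:
        value = 1.0                      # Horner's rule, leading coefficient 1
        for a_j in reversed(lower):
            value = value * x + a_j
        return value

    # For |x| >= R >= 1, |q(x)| <= sum|a_j| / |x| < 1/2, so p(-R) < 0 < p(R).
    R = 1.0 + 2.0 * sum(abs(a_j) for a_j in lower)
    lo, hi = -R, R
    for _ in range(max_steps):
        if hi - lo <= tol:
            break
        mid = (lo + hi) / 2
        if p(mid) >= 0:
            hi = mid                     # keeps p(hi) >= 0
        else:
            lo = mid                     # keeps p(lo) < 0
    return (lo + hi) / 2


root = odd_poly_root([5.0, -2.0, 0.0])   # p(x) = x^3 - 2x + 5
print(root, root**3 - 2 * root + 5)
```

The bound R is usually far from sharp; it is chosen only because it makes the sign change of p on [−R, R] immediate.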


We have the following fixed-point theorem. 


Theorem 6.4.8 Suppose that [a, b] is a closed bounded interval and that
f : [a, b] → [a, b] is continuous. Then there exists c ∈ [a, b] with f(c) = c.

Proof If f(a) = a or if f(b) = b, there is nothing to prove. Otherwise,
let g(x) = x − f(x). Then g(a) = a − f(a) < 0 and g(b) = b − f(b) > 0.
By the intermediate value theorem, there exists c ∈ [a, b] with g(c) =
c − f(c) = 0.


Exercises 


6.4.1 Suppose that 0 < a < b. Find lim_{n→∞} (a^n + b^n)^{1/n}.

6.4.2 Show that n^{1/n} → 1 as n → ∞.

6.4.3 Does (n!)^{1/n} converge, as n → ∞?

6.4.4 Give an example of a continuous bijective map of (0, 1) onto itself with
no fixed point.

6.4.5 Let f(x) = x for x rational and f(x) = 1 − x for x irrational. Show that
f is a bijection of [0, 1] onto itself, and that f has exactly one point of
continuity. Can you find a bijection of [0, 1] onto itself with no points
of continuity?

6.4.6 Suppose that f is a continuous periodic function on R, and that t > 0.
Show that there exists x ∈ R with f(x) = (f(x + t) + f(x − t))/2.

6.4.7 Suppose that f(x) is a continuous function on [0, 1] with f(0) = f(1).
(a) Use the intermediate value theorem to show that there exists 0 ≤
x ≤ 1/2 with f(x) = f(x + 1/2).

(b) Suppose that n ∈ N and that n > 1. By considering the sequence
(f((j − 1)/n) − f(j/n))_{j=1}^n show that there exists 0 ≤ x ≤ 1 − 1/n
such that f(x) = f(x + 1/n).

(c) Suppose that 0 < λ < 1 and that 1/λ is not an integer. Let h(x) =
f(2/λ)x − f(2x/λ), where f is the saw-tooth function of Section
6.3. Show that there exists no x ∈ [0, 1 − λ] with h(x) = h(x + λ).


6.5 Point-wise convergence and uniform convergence 


Suppose that (f_n)_{n=1}^∞ is a sequence of real-valued functions on a set S, and
that f is another such function. The sequence (f_n)_{n=1}^∞ converges point-wise
to f if f_n(s) → f(s) for each s ∈ S. More formally,

• the sequence (f_n)_{n=1}^∞ converges point-wise to f if for each s ∈ S and each
ε > 0 there exists n_0 ∈ N such that |f_n(s) − f(s)| < ε for all n ≥ n_0.


Note that the choice of n_0 depends on both ε and s. This is a very natural
idea to consider, but it turns out that point-wise convergence is too weak for
many purposes, and is awkward to work with. A stronger, and more tractable,
notion is that of uniform convergence. Here the number n_0 depends only on
ε: the same value works for all s ∈ S. Formally,


• the sequence (f_n)_{n=1}^∞ converges uniformly to f on S if for each ε > 0 there
exists n_0 ∈ N such that |f_n(s) − f(s)| < ε for all n ≥ n_0 and all s ∈ S.


Let

f_n(x) = 2nx      for 0 ≤ x ≤ 1/2n,
f_n(x) = 2 − 2nx  for 1/2n < x ≤ 1/n,
f_n(x) = 0        otherwise.


Then f_n(0) = 0, and if 0 < x ≤ 1 then f_n(x) = 0 if n > 1/x, so that
f_n converges point-wise to 0 on [0, 1]. It does not converge uniformly, since
f_n(1/2n) = 1, for n ∈ N.

Uniform convergence is particularly useful when we consider continuity.
Here is the fundamental result connecting continuity and uniform
convergence: it is very easy, but very important.


Theorem 6.5.1 Suppose that (f_n)_{n=1}^∞ is a sequence of continuous real-
valued functions defined on a subset A of R and that f_n converges uniformly
to a function f, as n → ∞. Then f is continuous on A.


Proof Suppose that z_0 ∈ A and that ε > 0. Then there exists n_0 ∈ N
such that |f_n(z) − f(z)| < ε/3 for n ≥ n_0 and for all z ∈ A. Since f_{n_0} is
continuous at z_0, there exists δ > 0 such that if z ∈ A and |z − z_0| < δ then
|f_{n_0}(z) − f_{n_0}(z_0)| < ε/3. For such z,


|f(z) − f(z_0)| ≤ |f(z) − f_{n_0}(z)| + |f_{n_0}(z) − f_{n_0}(z_0)| + |f_{n_0}(z_0) − f(z_0)|
< ε/3 + ε/3 + ε/3 = ε.


Let f_n(x) = x^n for x ∈ [0, 1]. Then f_n(x) → 0 for 0 ≤ x < 1, and f_n(1) = 1,
so that f_n converges point-wise on [0, 1] to a discontinuous function; the
point-wise limit of continuous functions need not be continuous.
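
The difference between the two modes of convergence in this section's examples can be seen in a short computation. The sketch below is illustrative only: it evaluates the tent functions f_n defined earlier in this section with exact rational arithmetic, and then the powers x^n in floating point at the points x_n = (1/2)^{1/n}, which are chosen here for convenience.

```python
from fractions import Fraction


def tent(n: int, x: Fraction) -> Fraction:
    """The tent function f_n: height 1 at x = 1/2n, zero outside [0, 1/n]."""
    if 0 <= x <= Fraction(1, 2 * n):
        return 2 * n * x
    if Fraction(1, 2 * n) < x <= Fraction(1, n):
        return 2 - 2 * n * x
    return Fraction(0)


x = Fraction(1, 7)
# At a fixed point the values are eventually 0 ...
print([float(tent(n, x)) for n in range(1, 12)])
# ... but the peak at x = 1/2n always has height 1, so sup |f_n - 0| = 1.
print([float(tent(n, Fraction(1, 2 * n))) for n in range(1, 12)])

# The powers x^n behave similarly on [0, 1): at x_n = (1/2)^(1/n) < 1 the
# value of x_n^n is 1/2 for every n, so sup |x^n - 0| does not tend to 0.
print([round((0.5 ** (1 / n)) ** n, 12) for n in (1, 10, 100, 1000)])
```

In both cases the index n_0 needed for a given ε grows without bound as the point moves (towards 0 for the tent functions, towards 1 for the powers), which is exactly what uniform convergence rules out.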

An infinite series Σ_{n=0}^∞ f_n of real-valued functions on a set S converges
point-wise, or uniformly, if the sequence of partial sums does. It is said to
converge absolutely uniformly if Σ_{n=0}^∞ |f_n| converges uniformly.


Proposition 6.5.2 If an infinite series Σ_{n=0}^∞ f_n of real-valued functions
on a set S converges absolutely uniformly, then it converges uniformly.


Proof For each s ∈ S, Σ_{n=0}^∞ f_n(s) converges absolutely, and therefore
converges to t(s), say. Suppose that ε > 0. Then there exists n_0 such that
Σ_{j=m+1}^n |f_j(s)| < ε for n_0 ≤ m < n and for all s ∈ S. If s ∈ S and m ≥ n_0
then


|Σ_{j=0}^m f_j(s) − t(s)| = lim_{n→∞} |Σ_{j=0}^m f_j(s) − Σ_{j=0}^n f_j(s)|

= lim_{n→∞} |Σ_{j=m+1}^n f_j(s)| ≤ lim_{n→∞} Σ_{j=m+1}^n |f_j(s)| ≤ ε.


Since this holds for all s ∈ S, Σ_{n=0}^∞ f_n converges uniformly to t.


Here is a simple test for absolute uniform convergence. 


Proposition 6.5.3 (Weierstrass' uniform M test) Suppose that Σ_{n=0}^∞ f_n
is an infinite series of real-valued functions on a set S, and that (M_n)_{n=0}^∞ is
a sequence in R^+ for which |f_n(s)| ≤ M_n for all s ∈ S and all n ∈ Z^+. If
Σ_{n=0}^∞ M_n < ∞, then Σ_{n=0}^∞ f_n converges absolutely uniformly.


Proof An easy exercise. 
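
A numerical illustration of the test, which is not a proof, may help to fix ideas. The sketch below takes f_n(x) = sin(nx)/n² and M_n = 1/n² for n ≥ 1 (with f_0 = 0); this series, the sampling grid and the function names are all choices made for the illustration. It compares the largest sampled value of a block of the series with the corresponding block of Σ M_n.

```python
import math


def partial_sum(x: float, lo: int, hi: int) -> float:
    """Sum of f_n(x) = sin(nx)/n^2 for lo < n <= hi."""
    return sum(math.sin(n * x) / n**2 for n in range(lo + 1, hi + 1))


def tail_of_M(lo: int, hi: int) -> float:
    """Sum of M_n = 1/n^2 for lo < n <= hi."""
    return sum(1.0 / n**2 for n in range(lo + 1, hi + 1))


grid = [2 * math.pi * k / 400 for k in range(401)]
for m in (5, 50, 500):
    worst = max(abs(partial_sum(x, m, 4 * m)) for x in grid)
    print(m, worst, tail_of_M(m, 4 * m))
```

The bound on the right does not depend on x, and it tends to 0 as m increases because Σ M_n converges; this independence of x is the source of the uniformity.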


We shall consider uniform convergence in a more general setting in 
Volume II.


Exercises 


6.5.1 Let (r_k)_{k=0}^∞ be an enumeration of the rationals in [0, 1], with r_0 = 0,
r_1 = 1. If x ∈ [0, 1], let f_0(x) = x, let f_1(x) = 1 − x and let


f_k(x) = x/r_k              if 0 ≤ x ≤ r_k,
f_k(x) = (1 − x)/(1 − r_k)  if r_k ≤ x ≤ 1,


for k > 1. Let g_n(x) = Σ_{k=0}^∞ (f_k(x))^n/2^k, for n ∈ N.

(a) Show that the sum converges uniformly on [0, 1], so that g_n is a
continuous function on [0, 1].

(b) Show that g_n(r_k) → 2^{−k} as n → ∞, for each k ∈ Z^+, and that
g_n(y) → 0 as n → ∞, for each irrational y in [0, 1].

(c) Let h(x) = lim_{n→∞} g_n(x). Show that h is discontinuous at the
rational points of [0, 1] and is continuous at the irrational points.

6.5.2 Construct a sequence (f_n)_{n=0}^∞ of continuous functions such that
Σ_{n=0}^∞ |f_n| converges point-wise and Σ_{n=0}^∞ f_n converges uniformly, but
not absolutely uniformly.

6.5.3 Prove Weierstrass' uniform M test.

6.5.4 Dirichlet's test for uniform convergence. Suppose that (a_j)_{j=0}^∞ is a
decreasing sequence of non-negative real-valued functions on a set
S which converges uniformly to 0 and that (z_j)_{j=0}^∞ is a sequence
of real-valued functions on S for which the sequence of partial
sums (Σ_{j=0}^n z_j)_{n=0}^∞ is uniformly bounded: there exists M such that
|Σ_{j=0}^n z_j(s)| ≤ M for all n ∈ Z^+ and all s ∈ S. Use Abel's formula to
show that Σ_{j=0}^∞ a_j z_j converges uniformly.


6.6 More on power series 


We now consider the continuity of functions defined by power series. These 
are complex-valued functions, and we need to introduce the notion of the 
continuity of a complex-valued function of a complex variable. The definition 
is essentially the same as the definition of continuity of a real-valued function 
of a real variable. Suppose that f is a complex-valued function defined on a
subset A of C, and that z_0 ∈ A. Then f is continuous at z_0 if whenever ε > 0
there exists δ > 0 such that if z ∈ A and |z − z_0| < δ then |f(z) − f(z_0)| <
ε. f is continuous on A if it is continuous at each point of A. (Of course,
a real-valued function defined on a subset A of R can be considered as a 
complex-valued function, and A can be considered as a subset of C: the 
two definitions of continuity are then trivially the same.) The reader should 
convince himself or herself that, except for the sandwich principle, which 
has no obvious analogue, the statements for complex-valued functions of a 
complex variable which correspond to the statements of Theorems 6.3.1 and 
6.3.3 and Proposition 6.3.2 are also true. In particular, polynomial functions 
on C are continuous on C. 

There is also a complex version of Theorem 6.5.1. The proof is the same 
as in the real case. 


Theorem 6.6.1 Suppose that $(f_n)_{n=1}^{\infty}$ is a sequence of continuous complex-valued
functions defined on a subset $A$ of $\mathbf{C}$ and that $f_n$ converges uniformly
to a function $f$, as $n \to \infty$. Then $f$ is continuous on $A$.


Complex versions of the Weierstrass M test and Dirichlet’s test also hold
(Exercises 6.6.3 and 6.6.4).

Suppose that $\sum_{n=0}^{\infty} a_n z^n$ is a complex power series with non-zero radius
of convergence $R$. If $|z| < R$, let $f(z) = \sum_{n=0}^{\infty} a_n z^n$.

Theorem 6.6.2 Suppose that $\sum_{n=0}^{\infty} a_n z^n$ is a complex power series with
radius of convergence $R$. If $r < R$ then $\sum_{n=0}^{\infty} a_n z^n$ converges absolutely
uniformly on $\{z : |z| \le r\}$ and the function $f(z) = \sum_{n=0}^{\infty} a_n z^n$ on $\{z : |z| < R\}$
is continuous on $\{z : |z| < R\}$.


Proof Choose $r < s < R$, and let $M_s = \sup_{n \in \mathbf{Z}^+} |a_n| s^n$. Then

\[ \sum_{n=0}^{\infty} |a_n| r^n = \sum_{n=0}^{\infty} |a_n| s^n \left(\frac{r}{s}\right)^n \le \sum_{n=0}^{\infty} M_s \left(\frac{r}{s}\right)^n = \frac{M_s s}{s-r}, \]

and if $|z| \le r$ then $|a_n z^n| \le |a_n| r^n$. Applying Weierstrass’ uniform M test
(Exercise 6.5.3) and Theorem 6.6.1, it follows that $\sum_{n=0}^{\infty} a_n z^n$ converges absolutely
uniformly on the set $\{z : |z| \le r\}$ to a function which is continuous on
$\{z : |z| \le r\}$. If $|z| < R$, choose $r$ with $|z| < r < R$. Then $f$ is continuous on
the set $\{z : |z| < r\}$, and so, considered as a function on the set $\{z : |z| < R\}$,
it is continuous at $z$.


Note that the proof depends only on the convergence of a geometric series. 
This simple idea is very powerful, and we shall use it, and the convergence 
of series such as $\sum_{n=0}^{\infty} n^k r^n$, where $0 < r < 1$ and $k \in \mathbf{N}$, many times in the
future. 

Provided that their radii of convergence are positive, different power series 
define different functions. 


Theorem 6.6.3 Suppose that the power series $\sum_{n=0}^{\infty} a_n z^n$ and $\sum_{n=0}^{\infty} b_n z^n$
each have radius of convergence greater than or equal to $R > 0$. Let $f(z) =
\sum_{n=0}^{\infty} a_n z^n$ and $g(z) = \sum_{n=0}^{\infty} b_n z^n$, for $|z| < R$. Suppose that $(z_k)_{k=1}^{\infty}$ is
a null sequence of non-zero complex numbers in $\{z : |z| < R\}$ such that
$f(z_k) = g(z_k)$ for all $k \in \mathbf{N}$. Then $a_n = b_n$ for all $n \in \mathbf{Z}^+$.


Proof If not, let $N$ be the least integer for which $a_N \ne b_N$. Let

\[ f_N(z) = f(z) - \sum_{n=0}^{N-1} a_n z^n = \sum_{n=N}^{\infty} a_n z^n = z^N \left( \sum_{n=0}^{\infty} a_{n+N} z^n \right) = z^N F_N(z), \]

and let

\[ g_N(z) = g(z) - \sum_{n=0}^{N-1} b_n z^n = \sum_{n=N}^{\infty} b_n z^n = z^N \left( \sum_{n=0}^{\infty} b_{n+N} z^n \right) = z^N G_N(z). \]

Since $a_n = b_n$ for $n < N$, $f_N(z_k) = g_N(z_k)$ for all $k \in \mathbf{N}$, and so $F_N(z_k) = G_N(z_k)$ for all $k \in \mathbf{N}$.
Since $F_N(z) = \sum_{n=0}^{\infty} a_{n+N} z^n$ for $|z| < R$, $F_N$ is continuous at 0. So is $G_N$,
and so

\[ a_N = F_N(0) = \lim_{k \to \infty} F_N(z_k) = \lim_{k \to \infty} G_N(z_k) = G_N(0) = b_N, \]

giving a contradiction.


This means that if we obtain two power series for the same function, we 
can ‘equate coefficients’. 

Suppose that the power series $\sum_{n=0}^{\infty} a_n z^n$ has radius of convergence
1. What can we say about $\sum_{n=0}^{\infty} a_n$? We begin with an easy result.


Proposition 6.6.4 Suppose that the power series $\sum_{n=0}^{\infty} a_n z^n$ has radius
of convergence 1. The following are equivalent.


(i) The series $\sum_{n=0}^{\infty} a_n$ is absolutely convergent.
(ii) $\sum_{n=0}^{\infty} a_n z^n$ converges absolutely uniformly on $\bar{D} = \{z : |z| \le 1\}$ to a continuous
function $f$ on $\bar{D}$.
(iii) The set $\{\sum_{n=0}^{\infty} |a_n| x^n : 0 \le x < 1\}$ is bounded.


Proof Since $|a_n z^n| \le |a_n|$ for $z \in \bar{D}$, the equivalence of (i) and (ii) follows
from the complex version of Weierstrass’ uniform M test (Exercise 6.6.3). If
(i) holds, then $f$ is bounded on $\bar{D}$, since $\sum_{n=0}^{\infty} |a_n| x^n \le \sum_{n=0}^{\infty} |a_n|$, and so
(iii) holds. Finally, suppose that (iii) holds, and that

\[ M = \sup \left\{ \sum_{n=0}^{\infty} |a_n| x^n : 0 \le x < 1 \right\}. \]

If $N \in \mathbf{N}$ then

\[ \sum_{n=0}^{N} |a_n| = \lim_{x \nearrow 1} \sum_{n=0}^{N} |a_n| x^n \le M, \]

so that $\sum_{n=0}^{\infty} |a_n| \le M$, and (i) holds.


What happens if $\sum_{n=0}^{\infty} a_n$ is conditionally convergent? First the radius of
convergence is at least 1, since the sequence $(a_n)_{n=0}^{\infty}$ is bounded. If it were
greater than 1, then $\sum_{n=0}^{\infty} a_n$ would converge absolutely. Consequently the
radius of convergence is 1.


Theorem 6.6.5 (Abel’s theorem) Suppose that the series $\sum_{n=0}^{\infty} a_n$ is
convergent, to $s$, say. Then $\sum_{n=0}^{\infty} a_n x^n \to s$ as $x \nearrow 1$.


Here we only consider real values of x. A stronger result is obtained in 
Exercise 6.6.2. 


Proof By replacing $a_0$ by $a_0 - s$, we can suppose that $s = 0$. Let $s_n =
\sum_{j=0}^{n} a_j$, for $n \in \mathbf{Z}^+$. The sequence $(s_n)_{n=0}^{\infty}$ is bounded: let $M = \sup\{|s_n| :
n \in \mathbf{Z}^+\}$. If $0 \le x < 1$, the series $\sum_{n=0}^{\infty} a_n x^n$ converges absolutely; let
its sum be $f(x)$. The series $\sum_{n=0}^{\infty} x^n$ also converges absolutely, and so by
Proposition 4.6.1, the convolution product $\sum_{n=0}^{\infty} c_n x^n$ converges absolutely
to $f(x)/(1-x)$. But $c_n = s_n$, and so $f(x) = (1-x) \sum_{n=0}^{\infty} s_n x^n$.

Suppose that $0 < \epsilon < 1$. Let $\eta = \epsilon/2$. There exists $n_0$ such that $|s_n| < \eta$
for $n \ge n_0$, and so

\[ \left| (1-x) \sum_{n=n_0}^{\infty} s_n x^n \right| \le \eta (1-x) \sum_{n=n_0}^{\infty} x^n \le \eta (1-x) \sum_{n=0}^{\infty} x^n = \eta. \]

On the other hand,

\[ \left| (1-x) \sum_{n=0}^{n_0-1} s_n x^n \right| \le (1-x) M n_0. \]

If $1 - \eta/((M+1)n_0) < x < 1$ then $|(1-x) \sum_{n=0}^{n_0-1} s_n x^n| < \eta$, and so

\[ \left| \sum_{n=0}^{\infty} a_n x^n \right| = |f(x)| = \left| (1-x) \sum_{n=0}^{\infty} s_n x^n \right| < 2\eta = \epsilon. \]

The next result involves a decreasing sequence of non-negative coefficients. 


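Proposition 6.6.6 Suppose that $(a_n)_{n=0}^{\infty}$ is a decreasing null-sequence of
positive numbers, and that $\sum_{n=0}^{\infty} a_n z^n$ has radius of convergence 1. Suppose
that $0 < \delta < 1$. Then $\sum_{n=0}^{\infty} a_n z^n$ converges uniformly on the set

\[ P_\delta = \{z \in \mathbf{C} : |z| \le 1, |z-1| \ge \delta\}. \]

Proof Let $t_n(z) = z^n$, for $z \in P_\delta$, so that $t_n \in C(P_\delta)$. Then

\[ \left| \sum_{j=0}^{n} t_j(z) \right| = \left| \sum_{j=0}^{n} z^j \right| = \left| \frac{1 - z^{n+1}}{1 - z} \right| \le \frac{2}{\delta}, \]

so that $|\sum_{j=0}^{n} t_j| \le 2/\delta$. The result now follows from Dirichlet’s test for
uniform convergence (Exercise 6.6.4).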
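
The key estimate in this proof is that the partial sums of the geometric series are bounded by $2/\delta$ on the set where $|z| \le 1$ and $|z - 1| \ge \delta$. A rough numerical check, assuming a sample grid of radii and angles chosen here purely for illustration (the helper name `max_partial_sum` is hypothetical):

```python
import cmath
import math

def max_partial_sum(z, n_max):
    """Largest |sum_{j=0}^{n} z^j| over 0 <= n <= n_max."""
    total, power, largest = 0j, 1 + 0j, 0.0
    for _ in range(n_max + 1):
        total += power
        largest = max(largest, abs(total))
        power *= z
    return largest

delta = 0.1
samples = [r * cmath.exp(1j * 2 * math.pi * k / 360)
           for r in (0.5, 0.9, 1.0)
           for k in range(360)]
# Keep only the sample points with |z| <= 1 and |z - 1| >= delta.
in_P = [z for z in samples if abs(z) <= 1 and abs(z - 1) >= delta]

worst = max(max_partial_sum(z, 500) for z in in_P)
print(worst, 2 / delta)
```

On this grid the largest partial sum stays below $2/\delta$; a finite sample is of course only an illustration of the uniform bound, not a proof of it.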


Suppose that the power series $\sum_{n=0}^{\infty} a_n z^n$ has positive radius of convergence
$R$, and that $a_0 \ne 0$. The function $f(z) = \sum_{n=0}^{\infty} a_n z^n$ is continuous on
$U_R = \{z : |z| < R\}$ and so there exists $0 < r < R$ such that if $|z| < r$ then
$f(z) \ne 0$, and we can consider the function $1/f(z)$. Can it be expressed as a
power series?

Theorem 6.6.7 Suppose that the power series $\sum_{n=0}^{\infty} a_n z^n$ has positive
radius of convergence $R$, and let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ for $z \in U_R$. Suppose that
$0 < S \le R$, and that $f$ has no zeros in the disc $U_S = \{z : |z| < S\}$. Then
there exists a power series $\sum_{n=0}^{\infty} c_n z^n$ with positive radius of convergence $T$
such that, if we set $g(z) = \sum_{n=0}^{\infty} c_n z^n$ for $z \in U_T$, then $f(z)g(z) = 1$ for
$|z| < \min(S,T)$.


Proof By multiplying $f$ by $a_0^{-1}$ we can suppose that $a_0 = 1$. (We do this to
simplify the calculations.) Since the series $\sum_{n=1}^{\infty} a_n z^n$ converges absolutely
for $|z| < R$, and since the function $\sum_{n=1}^{\infty} |a_n| t^n$ is continuous on $[0, R)$, there
exists $t > 0$ such that $\sum_{n=1}^{\infty} |a_n| t^n < 1$.

In order to see how to proceed, we consider the product of the two series.
We require that $c_0 = 1$ and that $\sum_{j=0}^{n} a_j c_{n-j} = 0$ for $n \in \mathbf{N}$. Thus we require
that

\[ c_n = -\sum_{j=1}^{n} a_j c_{n-j} \quad \text{for } n \in \mathbf{N}. \]

This provides a recursive formula for the sequence $(c_n)_{n=0}^{\infty}$. We now show
that the series $\sum_{n=0}^{\infty} c_n z^n$ has radius of convergence at least $t$. First we show,
by induction, that $|c_n| t^n \le 1$ for all $n$. The result is true if $n = 0$. Suppose
that it is true for $j < n$. Then

\[ |c_n| t^n = \left| \sum_{j=1}^{n} (a_j t^j)(c_{n-j} t^{n-j}) \right| \le \sum_{j=1}^{n} (|a_j| t^j)(|c_{n-j}| t^{n-j}) \le \sum_{j=1}^{n} |a_j| t^j < 1, \]

establishing the claim. If $|z| < t$ then $\sum_{n=0}^{\infty} |c_n z^n| \le \sum_{n=0}^{\infty} (|z|/t)^n < \infty$, so
that the series $\sum_{n=0}^{\infty} c_n z^n$ has positive radius of convergence $T$, with $T \ge t$.
Finally, if $|z| < \min(S,T)$ then $f(z)g(z) = 1$, by Proposition 4.6.1.
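
The recursion $c_0 = 1$, $c_n = -\sum_{j=1}^{n} a_j c_{n-j}$ (after normalising so that $a_0 = 1$) is easy to run. The following sketch uses exact rational arithmetic; the function name `reciprocal_coefficients` and the choice of the exponential series as a test case are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial

def reciprocal_coefficients(a, count):
    """First `count` coefficients of 1/f, where f has coefficients a[0], a[1], ... and a[0] != 0."""
    a0 = Fraction(a[0])
    b = [Fraction(x) / a0 for x in a]      # normalise so that b[0] = 1
    c = [Fraction(1)]
    for n in range(1, count):
        # c_n = - sum_{j=1}^{n} b_j c_{n-j}; coefficients beyond the list are taken to be 0
        c.append(-sum(b[j] * c[n - j] for j in range(1, n + 1) if j < len(b)))
    return [x / a0 for x in c]             # undo the normalisation

N = 8
exp_coeffs = [Fraction(1, factorial(n)) for n in range(N)]
print(reciprocal_coefficients(exp_coeffs, N))
```

For the exponential series the output agrees with the coefficients $(-1)^n/n!$ of $\exp(-z)$, and for $f(z) = 1 - z$ the recursion returns the geometric series coefficients $1, 1, 1, \ldots$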


Exercises 


6.6.1 Suppose that $f$ is a complex-valued function on a subset $A$ of $\mathbf{C}$. Show
that $f$ is continuous on $A$ if and only if its real and imaginary parts
are continuous, and if and only if $\bar{f}$ is continuous. Show that $|f|$ is
continuous if $f$ is.

6.6.2 Suppose that the series $\sum_{n=0}^{\infty} a_n$ is convergent, to $s$, say. Suppose that
$K > 0$. Let $W_K = \{z : |1-z| \le K(1-|z|)\}$. Sketch $W_K$. Show that
$\sum_{n=0}^{\infty} a_n z^n \to s$ as $z \to 1$ in $W_K$.

6.6.3 State and prove Weierstrass’ uniform M test for complex-valued 
functions. 
6.6.4 Dirichlet’s test for uniform convergence: the complex case. Suppose
that $(a_j)_{j=0}^{\infty}$ is a decreasing sequence of non-negative real-valued functions
on a set $S$ which converges uniformly to 0 and that $(z_j)_{j=0}^{\infty}$ is a
sequence of complex-valued functions on $S$ for which the sequence of
partial sums $(\sum_{j=0}^{n} z_j)_{n=0}^{\infty}$ is uniformly bounded: there exists $M$ such
that $|\sum_{j=0}^{n} z_j(s)| \le M$ for all $n \in \mathbf{Z}^+$ and all $s \in S$. Use Abel’s formula
to show that $\sum_{j=0}^{\infty} a_j z_j$ converges uniformly.
Differentiation


7.1 Differentiation at a point 


We now restrict attention to real-valued functions defined on an interval. 
Suppose that $f$ is a real-valued function on an interval $I$, and that $a$ is an
interior point of $I$, so that there exists $\eta > 0$ such that $(a-\eta, a+\eta) \subseteq I$.
Then $f$ is differentiable at $a$, with derivative $f'(a)$, if whenever $\epsilon > 0$ there
exists $0 < \delta < \eta$ such that if $0 < |x-a| < \delta$ then

\[ \left| \frac{f(x) - f(a)}{x - a} - f'(a) \right| < \epsilon. \]

In other words, $(f(x) - f(a))/(x-a) \to f'(a)$ as $x \to a$. Thus if $f$ is differentiable
at $a$, then the derivative $f'(a)$ is uniquely determined. The derivative
$f'(a)$ is also denoted by $\frac{df}{dx}(a)$.

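Note that if $0 < |x - a| < \min(\delta, 1)$ then

\[ |f(x) - f(a) - f'(a)(x - a)| \le \epsilon |x - a| < \epsilon. \]

Since $f'(a)(x-a) \to 0$ as $x \to a$, it follows that $f$ is continuous at $a$.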
This definition of the derivative involves division. It is convenient to have 
characterizations which avoid this. 


Proposition 7.1.1 Suppose that $f$ is a real-valued function on an interval
$I$, that $a$ is an interior point of $I$, that $(a-\eta, a+\eta) \subseteq I$ and that $l \in \mathbf{R}$.
The following are equivalent.


(i) $f$ is differentiable at $a$, with derivative $l$.
(ii) There is a real-valued function $r$ on $(-\eta,\eta) \setminus \{0\}$ such that

\[ f(a+h) = f(a) + lh + r(h) \quad \text{for } 0 < |h| < \eta \]

for which $r(h)/h \to 0$ as $h \to 0$.
(iii) There is a real-valued function $s$ on $(-\eta,\eta)$ such that

\[ f(a+h) = f(a) + (l + s(h))h \quad \text{for } |h| < \eta \]

for which $s(0) = 0$ and $s$ is continuous at 0.

Proof Conditions (i) and (ii) are equivalent, since

\[ \frac{r(h)}{h} = \frac{f(a+h) - f(a)}{h} - l, \]

and (ii) and (iii) are equivalent, since $s(h) = r(h)/h$ for $h \ne 0$.


There are several closely related reasons for considering differentiability.
Suppose that $b \in I$ and that $b \ne a$. Then the graph of the function $l_{a,b}$
defined by

\[ l_{a,b}(x) = f(a) + \frac{f(b) - f(a)}{b - a}(x - a) \]

is a straight line which includes the line segment $[(a, f(a)), (b, f(b))]$. The
quantity $(f(b) - f(a))/(b-a)$ is the slope of the line. Thus $f$ is differentiable
at $a$, with derivative $f'(a)$, if and only if the slope tends to $f'(a)$ as $b$ tends
to $a$. If so, then the graph of the function $t_a$ defined by

\[ t_a(x) = f(a) + f'(a)(x - a) \]

is the tangent to the graph of $f$ at $a$.


Figure 7.1. Differentiation, and the tangent.

If $|h| < \eta$, and we write

\[ f(a+h) = t_a(a+h) + r(h) = f(a) + f'(a)h + r(h) \]

then $r(h)/h \to 0$ as $h \to 0$, so that $r(h) = o(|h|)$ and $t_a$ is a linear approximation
to $f$ near $a$. Further, a small change $h$ in the variable produces a small
change approximately equal to $f'(a)h$ in the value of the function $f$, so that
$f'(a)$ is the rate of change of $f$ at $a$.

Let us give some easy examples. 


Example 7.1.2 The function $f(x) = x^n$, with $n \in \mathbf{N}$, $n \ge 2$.


By the binomial theorem,

\[ f(a+h) = a^n + n a^{n-1} h + r(h), \quad \text{where } r(h) = h^2 q(h), \]

with $q$ a polynomial in $h$ of degree $n - 2$. Thus $r(h)/h \to 0$ as $h \to 0$, and so
$f$ is differentiable, with derivative $n a^{n-1}$.

Example 7.1.3 The function $f(x) = 1/x$, on $(0,\infty)$, or on $(-\infty, 0)$.

If $0 < |h| < |a|$ then

\[ f(a+h) - f(a) = \frac{-h}{a(a+h)}, \quad \text{so that} \quad \frac{f(a+h) - f(a)}{h} \to -\frac{1}{a^2} \]

as $h \to 0$. Thus $f$ is differentiable at $a$, with derivative $-1/a^2$.


Example 7.1.4 The real exponential function exp on $\mathbf{R}$.


Since $\exp(a+h) = \exp(a)\exp(h)$, it follows that $\exp(a+h) =
\exp(a) + \exp(a)h + s(h)h$, where

\[ s(h) = \exp(a)\,\frac{\exp(h) - 1 - h}{h} = \exp(a)\left( \frac{h}{2!} + \frac{h^2}{3!} + \cdots \right), \]

so that if $|h| < 1$ then

\[ |s(h)| \le \frac{\exp(a)|h|}{2}\left(1 + |h| + |h|^2 + \cdots\right) = \frac{\exp(a)|h|}{2(1 - |h|)}, \]

and $s(h) \to 0$ as $h \to 0$. Thus exp is differentiable, and the derivative at $a$ is
$\exp(a)$.
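
The estimate $|s(h)| \le \exp(a)|h|/(2(1-|h|))$ from Example 7.1.4 can be checked numerically. This is only a sketch: the base point $a = 0.5$ and the sample values of $h$ are arbitrary choices made here.

```python
import math

def s(a, h):
    """The function s(h) with exp(a+h) = exp(a) + exp(a) h + s(h) h, for h != 0."""
    # math.expm1(h) computes exp(h) - 1 without cancellation for small h.
    return math.exp(a) * (math.expm1(h) - h) / h

a = 0.5
for h in (0.5, -0.5, 0.1, -0.1, 1e-3):
    bound = math.exp(a) * abs(h) / (2 * (1 - abs(h)))
    print(h, s(a, h), bound)
```

In each row the middle value is smaller in modulus than the bound on the right, and both shrink with $|h|$.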
Here are some basic properties of differentiation. 


Proposition 7.1.5 Suppose that $f$ and $g$ are real-valued functions on an
interval $I$, that $(a-\eta, a+\eta) \subseteq I$, and that $f$ and $g$ are differentiable at $a$.
Suppose also that $\lambda, \mu \in \mathbf{R}$.


(i) The derivative $f'(a)$ is unique.
(ii) $\lambda f + \mu g$ is differentiable at $a$, with derivative $\lambda f'(a) + \mu g'(a)$.
(iii) The product $fg$ is differentiable at $a$, with derivative $f'(a)g(a) +
f(a)g'(a)$.

(iv) If $f$ is an increasing function, then $f'(a) \ge 0$.

(v) If $f'(a) > 0$ then there exists $0 < \delta < \eta$ such that $f(x) < f(a) < f(y)$
for $a - \delta < x < a < y < a + \delta$.


Proof (i), (ii) and (iv) follow immediately from the definition.
(iii) If $0 < |h| < \eta$,

\[ \frac{f(a+h)g(a+h) - f(a)g(a)}{h} = \left( \frac{f(a+h) - f(a)}{h} \right) g(a+h) + f(a) \left( \frac{g(a+h) - g(a)}{h} \right) \to f'(a)g(a) + f(a)g'(a) \]

as $h \to 0$, since $g$ is continuous at $a$.
(v) There exists $0 < \delta < \eta$ such that

\[ \left| \frac{f(a+h) - f(a)}{h} - f'(a) \right| < |f'(a)| \quad \text{for } 0 < |h| < \delta, \]

from which it follows that $(f(a+h) - f(a))/h > 0$ for $0 < |h| < \delta$. Thus
$f(a+h) - f(a) > 0$ if $0 < h < \delta$ and $f(a+h) - f(a) < 0$ if $-\delta < h < 0$.


It is tempting to suppose that if $f'(a) > 0$ then $f$ must be an increasing
function in some interval $(a - \delta, a + \delta)$. Exercise 7.1.3 shows that this is not
the case.

Next we turn to the composition of two functions. 


Theorem 7.1.6 (The chain rule) Suppose that $f$ is a real-valued function
on an open interval $I$, that $g$ is a real-valued function on an open interval $J$,
and that $f(I) \subseteq J$. Suppose that $a \in I$, that $f$ is differentiable at $a$ and that
$g$ is differentiable at $f(a)$. Then the composite function $g \circ f$ is differentiable
at $a$, with derivative $(g \circ f)'(a) = g'(f(a))f'(a)$.


Proof First let us give an inadequate ‘proof’. For small $h$,

\[ \frac{g(f(a+h)) - g(f(a))}{h} = \frac{g(f(a+h)) - g(f(a))}{f(a+h) - f(a)} \cdot \frac{f(a+h) - f(a)}{h}. \qquad (*) \]

Since $f$ is continuous at $a$, $f(a+h) - f(a) \to 0$ as $h \to 0$, and so

\[ \frac{g(f(a+h)) - g(f(a))}{f(a+h) - f(a)} \to g'(f(a)) \quad \text{as } h \to 0. \]

Since $(f(a+h) - f(a))/h \to f'(a)$ as $h \to 0$, the result follows.

What is wrong with this ‘proof’? It may happen that $f(a+h) = f(a)$, in
which case the expression $(*)$ makes no sense. We must avoid dividing by 0.

We consider two possibilities. First, there exists $\delta > 0$ such that
$(a-\delta, a+\delta) \subseteq I$ and $f(a+h) \ne f(a)$ for $0 < |h| < \delta$. In this case the
preceding argument is valid.

Secondly, $a$ is the limit point of a sequence $(a_n)_{n=1}^{\infty}$ in $I \setminus \{a\}$ for which
$f(a_n) = f(a)$. In this case it follows that $f'(a) = 0$, and we must show that
$(g \circ f)'(a) = 0$. Let $b = f(a)$. We use Proposition 7.1.1. There exists $\eta > 0$ such
that $(b-\eta, b+\eta) \subseteq J$ and a function $t$ on $(-\eta, \eta)$, with $t(0) = 0$, such that
$g(b+k) = g(b) + (g'(b) + t(k))k$ for $k \in (-\eta,\eta)$ and such that $t$ is continuous
at 0. Similarly, there exists $\delta > 0$ such that $(a-\delta, a+\delta) \subseteq I$ and a function
$s$ on $(-\delta,\delta)$, with $s(0) = 0$, such that $f(a+h) = b + s(h)h$ for $h \in (-\delta, \delta)$
and such that $s$ is continuous at 0. Since $f$ is continuous at $a$, we can suppose that
$f((a-\delta, a+\delta)) \subseteq (b-\eta, b+\eta)$. If $0 < |h| < \delta$ then

\[ g(f(a+h)) = g(b + s(h)h) = g(b) + (g'(b) + t(s(h)h))s(h)h \]

so that

\[ \frac{g(f(a+h)) - g(f(a))}{h} = (g'(b) + t(s(h)h))s(h) \to 0 \quad \text{as } h \to 0, \]

since $s(h) \to 0$ and $t(s(h)h) \to 0$ as $h \to 0$.
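
The second case of the proof really does occur. As an illustration (an example chosen here, not taken from the text), let $f(x) = x^2 \sin(1/x)$ for $x \ne 0$ and $f(0) = 0$, and let $g = \exp$. Then $f(1/(k\pi)) = f(0)$ for every $k \in \mathbf{N}$, so the naive quotient in $(*)$ is undefined at points arbitrarily close to 0, while the difference quotient of $g \circ f$ at 0 still tends to $g'(0)f'(0) = 0$.

```python
import math

def f(x):
    """f(x) = x^2 sin(1/x) for x != 0 and f(0) = 0, so that f'(0) = 0."""
    return x * x * math.sin(1 / x) if x != 0 else 0.0

def quotient(h):
    """Difference quotient (exp(f(h)) - exp(f(0))) / h of the composite at 0."""
    # exp(f(0)) = 1, and math.expm1(y) = exp(y) - 1 avoids cancellation.
    return math.expm1(f(h)) / h

for k in (1, 10, 100, 1000):
    h_zero = 1 / (k * math.pi)         # here f(h_zero) = f(0), up to rounding
    h_other = 1 / (k * math.pi + 1)    # a nearby point where f is not 0
    print(k, quotient(h_zero), quotient(h_other))
```

Both columns tend to 0, in agreement with the chain rule.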


Corollary 7.1.7 Suppose that $g$ is a real-valued function on an open interval
$I$, that $a \in I$, that $g(a) \ne 0$ and that $g$ is differentiable at $a$. Then there
exists $\delta > 0$ such that $(a-\delta, a+\delta) \subseteq I$ and $g(x) \ne 0$ for $a - \delta < x < a + \delta$.
The function $1/g$ on $(a-\delta, a+\delta)$ is differentiable at $a$, with derivative
$-g'(a)/(g(a))^2$.

Further, if $f$ is a real-valued function on $I$ which is differentiable at $a$,
then the function $f/g$ on $(a-\delta, a+\delta)$ is differentiable at $a$, with derivative

\[ \left( \frac{f}{g} \right)'(a) = \frac{f'(a)g(a) - f(a)g'(a)}{(g(a))^2}. \]

Proof We can suppose without loss of generality that $g(a) > 0$. Since $g$
is continuous at $a$, there exists $\delta > 0$ such that $(a-\delta, a+\delta) \subseteq I$ and
$|g(x) - g(a)| < |g(a)|$ for $|x - a| < \delta$. Then $g(x) > 0$ for $x \in (a-\delta, a+\delta)$. Let
$h(y) = 1/y$ for $y \in (0,\infty)$. Then $h$ is differentiable at $g(a)$, with derivative
$-1/g(a)^2$. The first result therefore follows from the chain rule, applied to the
functions $g$ and $h$. The second result then follows from the product formula,
applied to the functions $f$ and $1/g$.


Suppose that $f$ is a strictly increasing continuous function on an open
interval $I$. Recall that $f(I)$ is an open interval, and that $f^{-1} : f(I) \to I$ is
continuous (Corollary 6.4.3 and Proposition 6.4.5). Suppose that $f$ is differentiable
at $a \in I$. Then $f'(a) \ge 0$, but it can happen that $f'(a) = 0$ [for
example, if $f(x) = x^3$ for $x \in \mathbf{R}$ then $f$ is strictly increasing and continuous,
and $f'(0) = 0$]. But if $f'(a) > 0$ then $f^{-1}$ is differentiable at $f(a)$.


Theorem 7.1.8 Suppose that $f$ is a strictly increasing continuous function
on an open interval $I$, that $f$ is differentiable at $a \in I$, and that $f'(a) > 0$.
Let $b = f(a)$. Then $f^{-1}$ is differentiable at $b$ and $(f^{-1})'(b) = 1/f'(a)$.


Proof Suppose that $\epsilon > 0$. Since $(f(a+h) - f(a))/h \to f'(a)$ as $h \to 0$,
since $f(a+h) - f(a) \ne 0$ for $h \ne 0$ and since $f'(a) \ne 0$, it follows that

\[ \frac{h}{f(a+h) - f(a)} \to \frac{1}{f'(a)} \quad \text{as } h \to 0. \]

Thus there exists $\eta > 0$ such that $(a-\eta, a+\eta) \subseteq I$ and

\[ \left| \frac{h}{f(a+h) - f(a)} - \frac{1}{f'(a)} \right| < \epsilon \quad \text{for } 0 < |h| < \eta. \]

By Proposition 6.4.5, the inverse mapping $f^{-1} : f(I) \to I$ is continuous.
There therefore exists $\delta > 0$ such that $(b-\delta, b+\delta) \subseteq f(I)$ and such that
$|f^{-1}(b+k) - f^{-1}(b)| < \eta$ for $|k| < \delta$. Suppose that $0 < |k| < \delta$; let
$h = f^{-1}(b+k) - a$, so that $f^{-1}(b+k) = a+h$. Then $0 < |h| < \eta$ and
$f(a+h) - f(a) = k$. Consequently

\[ \left| \frac{f^{-1}(b+k) - f^{-1}(b)}{k} - \frac{1}{f'(a)} \right| < \epsilon. \]
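
A numerical illustration of the formula $(f^{-1})'(b) = 1/f'(a)$, under assumptions chosen here: the function $f(x) = x^3 + x$ (strictly increasing, with $f'(a) = 3a^2 + 1 > 0$), the point $a = 1$, and an inverse computed by bisection on the hypothetical search interval $[-10, 10]$.

```python
def f(x):
    return x**3 + x

def f_inverse(y, lo=-10.0, hi=10.0):
    """Invert the strictly increasing function f on [lo, hi] by bisection."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

a = 1.0
b = f(a)                          # b = f(a) = 2
predicted = 1 / (3 * a**2 + 1)    # 1/f'(a) = 1/4
for k in (1e-1, 1e-2, 1e-3, 1e-4):
    quotient = (f_inverse(b + k) - f_inverse(b)) / k
    print(k, quotient, predicted)
```

The difference quotients of the inverse approach $1/4$ as $k$ decreases.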


It is at times useful to consider one-sided derivatives, for example at the
end points of intervals. Suppose that $f$ is a real-valued function on an interval
$I$, and that $[a, a+\eta) \subseteq I$. Then $f$ is differentiable on the right at $a$, with
right-hand derivative $f'(a+)$, if $(f(x) - f(a))/(x - a) \to f'(a+)$ as $x \searrow a$.
If so, then $f$ is continuous on the right at $a$. Differentiability on the left and the
left-hand derivative $f'(a-)$ are defined similarly. Then $f$ is differentiable at
an interior point $a$ if and only if it is differentiable on the right and on the
left and $f'(a+) = f'(a-)$.

It is important to realize that differentiability is a very special property. 


Example 7.1.9 A bounded continuous function $s$ on $\mathbf{R}$ which is not
differentiable at any point.

Let $f_0 = f$ be the saw-tooth function defined in Section 6.3:

\[ f_0(x) = \begin{cases} \{x\} & \text{for } 2k \le x < 2k+1, \\ 1 - \{x\} & \text{for } 2k+1 \le x < 2k+2, \end{cases} \]

for $k \in \mathbf{Z}$. Thus $f_0$ is continuous, is periodic, with period 2 (that is, $f(x+2) =
f(x)$ for all $x \in \mathbf{R}$), and is linear, with derivative $\pm 1$, in each open interval
$(k, k+1)$, with $k \in \mathbf{Z}$.

Next we define $f_n$, for $n \in \mathbf{N}$. We set $f_n(x) = f_0(6^n x)/2^n$. Thus $f_0$ is
shrunk by a factor of $1/2^n$, but oscillates more rapidly. Let us list some of
the properties of $f_n$. Suppose that $x \in \mathbf{R}$.


(i) $0 \le f_n(x) \le 1/2^n$.

(ii) $f_n$ is linear on intervals of length $1/6^n$, and has derivative $\pm 3^n$ on
each interval. Thus there exists $x_n$ such that $|x_n - x| = 2/6^{n+1}$ and
$|f_n(x_n) - f_n(x)| = 3^n |x_n - x|$.

(iii) If $j < n$ then $|f_j(x_n) - f_j(x)| \le 3^j |x_n - x|$.

(iv) $f_j$ is periodic, with period $2/6^j$, so that if $j > n$ then $|x_n - x|$ is a
multiple of the period and $f_j(x) = f_j(x_n)$.

Now let $s_n(x) = \sum_{j=1}^{n} f_j(x)$. Then $s_n$ is a continuous function on $\mathbf{R}$. By
(i), and Weierstrass’ uniform M test, $s_n(x)$ converges uniformly on $\mathbf{R}$ to a
continuous function, $s(x)$ say, as $n \to \infty$. Further $|s(x) - s_n(x)| \le 1/2^n$.

Suppose that $x \in \mathbf{R}$. We show that $s$ is not differentiable at $x$. Suppose
that $n \in \mathbf{N}$ and that $x_n$ is defined as above. Then by (iv), $s(x_n) - s(x) =
s_n(x_n) - s_n(x)$. Now

\[ |s_n(x_n) - s_n(x)| \ge |f_n(x_n) - f_n(x)| - \sum_{j=1}^{n-1} |f_j(x_n) - f_j(x)| \ge 3^n |x_n - x| - \sum_{j=1}^{n-1} 3^j |x_n - x| = \left( \frac{3^n + 3}{2} \right) |x_n - x|, \]

by (ii) and (iii). Thus $x_n \to x$ as $n \to \infty$, while

\[ \left| \frac{s(x_n) - s(x)}{x_n - x} \right| \to \infty \quad \text{as } n \to \infty, \]

so that $s$ is not differentiable at $x$.
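
The growth of the difference quotients can be observed exactly with rational arithmetic. In this sketch the step is taken to be $|x_n - x| = 2/6^{n+1}$, an assumption made here so that the shift is a whole number of periods of $f_j$ for every $j > n$; the helper names and the sample point $x = 1/7$ are likewise illustrative choices.

```python
import math
from fractions import Fraction

def f0(x):
    """Saw-tooth of period 2: {x} on [2k, 2k+1), 1 - {x} on [2k+1, 2k+2)."""
    y = x % 2
    return y if y < 1 else 2 - y

def f(n, x):
    """f_n(x) = f_0(6^n x) / 2^n."""
    return f0(6**n * x) / 2**n

def s_partial(n, x):
    """s_n(x) = f_1(x) + ... + f_n(x)."""
    return sum(f(j, x) for j in range(1, n + 1))

def shifted_point(n, x):
    """A point at distance 2/6^(n+1) from x in the same linear piece of f_n."""
    d = Fraction(2, 6**(n + 1))
    left = Fraction(math.floor(6**n * x), 6**n)   # left end of the linear piece
    right = left + Fraction(1, 6**n)              # right end of the linear piece
    return x + d if x + d <= right else x - d

x = Fraction(1, 7)
for n in range(1, 7):
    xn = shifted_point(n, x)
    # Terms with index j > n cancel, so s(xn) - s(x) = s_n(xn) - s_n(x).
    quotient = abs(s_partial(n, xn) - s_partial(n, x)) / abs(xn - x)
    print(n, float(quotient), (3**n + 3) / 2)
```

Each quotient is at least $(3^n + 3)/2$, so the quotients are unbounded as $x_n \to x$.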


Exercises 


7.1.1 Suppose that $n \in \mathbf{N}$. Show that the function $f(x) = x^{1/n}$ on $I = (0,\infty)$
is differentiable at each point of $I$, and find its derivative.

7.1.2 Where is the function $f(x) = |x|$ differentiable? Let $(q_j)_{j=1}^{\infty}$ be an
enumeration of the rational numbers in $(0,1)$. Let

\[ f(x) = \sum_{j=1}^{\infty} \frac{|x - q_j|}{2^j}, \quad \text{for } x \in (0,1). \]

Show that $f$ is a continuous function on $(0,1)$. Show that $f$ is not
differentiable at the rational points of $(0,1)$ and that $f$ is differentiable
at the irrational points of $(0,1)$.

7.1.3 Let $x_n = 1/2^n$ and let $y_n = x_n + 1/5^n$, so that $y_1 > x_1 > y_2 > x_2 > \cdots$.
Define a real-valued function $f$ by setting

\[ f(x) = \begin{cases} \dfrac{y_n - |x|}{y_n - x_n} & \text{if } x_n \le |x| \le y_n, \\ \dfrac{|x| - y_{n+1}}{x_n - y_{n+1}} & \text{if } y_{n+1} < |x| < x_n, \\ 0 & \text{otherwise.} \end{cases} \]

Sketch the graph of $f$. Let $g(x) = x + x^2 f(x)$. Show that $g$ is differentiable
at 0 and that $g'(0) = 1$. Suppose that $\delta > 0$. Show that there
exist $0 < a < b < \delta$ such that $g(a) > g(b)$.


7.2 Convex functions 


We now consider an important class of functions, with interesting continuity 
and differentiability properties. 

Suppose that $E$ is a real or complex vector space, and that $u, v \in E$. Let
$\sigma : [0,1] \to E$ be defined by

\[ \sigma(t) = u + (v-u)t = (1-t)u + tv \quad \text{for } 0 \le t \le 1. \]

Then $\sigma([0,1])$ is the straight line segment $[u,v]$ between $u$ and $v$. A subset $C$
of $E$ is convex if $[u,v] \subseteq C$, for each $u, v$ in $C$. Thus a subset of $\mathbf{R}$ is convex
if and only if it is an interval.

Suppose that $f$ is a function on an interval $I$. $f$ is said to be convex if the
subset $\{(x,y) \in \mathbf{R}^2 : x \in I, y \ge f(x)\}$ of $\mathbf{R}^2$ is a convex set. Equivalently, if
$x_0, x_1 \in I$, then the straight line segment $[(x_0, f(x_0)), (x_1, f(x_1))]$ in $\mathbf{R}^2$ lies
above the graph $G_f = \{(x, f(x)) \in \mathbf{R}^2 : x \in I\}$. Since

\[ [(x_0, f(x_0)), (x_1, f(x_1))] = \{((1-t)x_0 + tx_1, (1-t)f(x_0) + tf(x_1)) : 0 \le t \le 1\}, \]

this says that

\[ (1-t)f(x_0) + tf(x_1) \ge f((1-t)x_0 + tx_1) \]

for all $x_0, x_1 \in I$ and all $0 \le t \le 1$.
We say that $f$ is strictly convex if

\[ (1-t)f(x_0) + tf(x_1) > f((1-t)x_0 + tx_1) \]

for distinct $x_0, x_1 \in I$ and all $0 < t < 1$. $f$ is concave if $-f$ is convex; that is,
$(1-t)f(x_0) + tf(x_1) \le f((1-t)x_0 + tx_1)$ for all $x_0, x_1 \in I$ and all $0 \le t \le 1$.
Strict concavity is defined similarly.
The next proposition provides some alternative characterizations of con- 
vexity. 


Proposition 7.2.1 Suppose that $f$ is a real-valued function on an open
interval $I$. The following are equivalent.


(i) $f$ is convex.
(ii) If $a, b, c \in I$ and $a < b < c$ then

\[ \frac{f(b) - f(a)}{b - a} \le \frac{f(c) - f(a)}{c - a}. \]

(iii) If $a, b, c \in I$ and $a < b < c$ then

\[ \frac{f(c) - f(a)}{c - a} \le \frac{f(c) - f(b)}{c - b}. \]

(iv) If $a, b, c \in I$ and $a < b < c$ then

\[ \frac{f(b) - f(a)}{b - a} \le \frac{f(c) - f(b)}{c - b}. \]

Figure 7.2. A convex function.


Proof Let $t = (b-a)/(c-a)$, so that $0 < t < 1$, $1 - t = (c-b)/(c-a)$ and

\[ b = \frac{c-b}{c-a}\,a + \frac{b-a}{c-a}\,c = (1-t)a + tc. \]

The proof is then simply a matter of using this equation in the definition of
convexity, and rearranging the inequality. For example, if $f$ is convex, then

\[ f(b) \le \frac{c-b}{c-a} f(a) + \frac{b-a}{c-a} f(c), \]

so that

\[ f(b) - f(a) \le -\frac{b-a}{c-a} f(a) + \frac{b-a}{c-a} f(c) = \frac{b-a}{c-a} (f(c) - f(a)), \]

which gives (ii). Conversely, if (ii) holds, and if $x_0 < x_1$ and $0 < t < 1$ then
setting $x_t = (1-t)x_0 + tx_1$,

\[ \frac{f(x_t) - f(x_0)}{x_t - x_0} \le \frac{f(x_1) - f(x_0)}{x_1 - x_0}. \]

Since $x_t - x_0 = t(x_1 - x_0)$, this gives

\[ f(x_t) \le f(x_0) + t(f(x_1) - f(x_0)) = (1-t)f(x_0) + tf(x_1). \]

The other equivalences are proved in a similar way.
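
The chord-slope conditions lend themselves to a quick numerical test. The sketch below checks, for all triples $a < b < c$ from a finite grid, that the slope over $[a,b]$ is at most the slope over $[a,c]$, which is at most the slope over $[b,c]$ (and hence also that the slope over $[a,b]$ is at most the slope over $[b,c]$). The helper names, the grid and the rounding tolerance are assumptions made for illustration, and a finite grid can refute convexity but cannot prove it.

```python
import math
from itertools import combinations

def slope(f, u, v):
    """Slope of the chord of the graph of f between u and v."""
    return (f(v) - f(u)) / (v - u)

def three_chord_holds(f, points, tol=1e-12):
    """Check slope(a,b) <= slope(a,c) <= slope(b,c) for all a < b < c in `points`."""
    for a, b, c in combinations(sorted(points), 3):
        ab, ac, bc = slope(f, a, b), slope(f, a, c), slope(f, b, c)
        if not (ab <= ac + tol and ac <= bc + tol):
            return False
    return True

grid = [k / 10 for k in range(-20, 21)]
print(three_chord_holds(math.exp, grid))   # exp is convex
print(three_chord_holds(abs, grid))        # |x| is convex
print(three_chord_holds(math.sin, grid))   # sin is not convex on [-2, 2]
```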


Here are some basic properties of convex functions. 


Proposition 7.2.2 (i) If $f$ and $g$ are convex functions on an interval $I$
and $a \ge 0$ then $f + g$ and $af$ are convex.

(ii) If $(f_n)_{n=1}^{\infty}$ is a sequence of convex functions on an interval $I$, and if
$f_n(x) \to f(x)$ as $n \to \infty$, for each $x \in I$, then $f$ is convex.

(iii) If $\{f : f \in F\}$ is a family of convex functions on an interval $I$ for
which $g(x) = \sup\{f(x) : f \in F\}$ is finite for each $x \in I$ then $g$ is convex.

(iv) If $f, g$ are convex, non-negative increasing functions on an interval $I$
then $fg$ is convex.

(v) If $f$ is a convex function on an interval $I$, and if $\phi$ is an increasing
convex function on an interval $J$ which contains $f(I)$, then $\phi \circ f$ is a convex
function on $I$.


Proof (i) and (ii) follow immediately from the definitions.

We suppose that $x_0, x_1 \in I$ and that $0 < t < 1$, and we set $x_t = (1-t)x_0 +
tx_1$.

(iii) Suppose that $\epsilon > 0$. There exists a function $f$ in $F$ such that $f(x_t) >
g(x_t) - \epsilon$. Then

\[ g(x_t) - \epsilon < f(x_t) \le (1-t)f(x_0) + tf(x_1) \le (1-t)g(x_0) + tg(x_1). \]

Since this holds for all $\epsilon > 0$, $g(x_t) \le (1-t)g(x_0) + tg(x_1)$.
(iv) Since $f$ and $g$ are increasing,

\[ (g(x_1) - g(x_0))(f(x_1) - f(x_0)) \ge 0. \]

Expanding and rearranging,

\[ f(x_0)g(x_1) + f(x_1)g(x_0) \le f(x_0)g(x_0) + f(x_1)g(x_1), \]

and so, since $f$ and $g$ are non-negative,

\begin{align*}
f(x_t)g(x_t) &\le ((1-t)f(x_0) + tf(x_1))((1-t)g(x_0) + tg(x_1)) \\
&= (1-t)^2 f(x_0)g(x_0) + t(1-t)(f(x_0)g(x_1) + f(x_1)g(x_0)) + t^2 f(x_1)g(x_1) \\
&\le (1-t)^2 f(x_0)g(x_0) + t(1-t)(f(x_0)g(x_0) + f(x_1)g(x_1)) + t^2 f(x_1)g(x_1) \\
&= (1-t) f(x_0)g(x_0) + t f(x_1)g(x_1).
\end{align*}

(v) Since $\phi$ is convex and increasing,

\[ \phi(f(x_t)) \le \phi((1-t)f(x_0) + tf(x_1)) \le (1-t)\phi(f(x_0)) + t\phi(f(x_1)). \]

We now turn to continuity and differentiability properties of convex
functions.

Theorem 7.2.3 Suppose that $f$ is a convex function on an open interval $I$.

(i) $f$ is continuous on $I$.

(ii) $f$ is differentiable on the right and on the left at each point $a$ of $I$,
and $f'(a-) \le f'(a+)$.

(iii) If $a < b$ then $f'(a+) \le f'(b-)$.

(iv) The mapping $a \to f'(a+)$ is increasing, and is continuous on the right
at each point $a$ of $I$.

(v) The mapping $a \to f'(a-)$ is increasing, and is continuous on the left
at each point $a$ of $I$.


Proof (i) is a consequence of (ii).

(ii) Suppose that $y < a < x$, with $x, y \in I$. By Proposition 7.2.1, the
function $x \to (f(x) - f(a))/(x - a)$ is an increasing function on $I \cap (a, \infty)$,
bounded below by $(f(a) - f(y))/(a - y)$. Thus $(f(x) - f(a))/(x - a)$ tends
to a limit $f'(a+)$ as $x \searrow a$, and $(f(a) - f(y))/(a-y) \le f'(a+)$. Similarly,
$(f(a) - f(y))/(a-y) \nearrow f'(a-)$ as $y \nearrow a$, and $f'(a-) \le f'(a+)$.

(iii) $f'(a+) \le (f(b) - f(a))/(b-a) \le f'(b-)$.

(iv) By (ii) and (iii), if $a < b$ then $f'(a+) \le f'(b-) \le f'(b+)$, so the
mapping $x \to f'(x+)$ is increasing. Suppose that $a \in I$. Given $\epsilon > 0$, there
exists $\delta > 0$ such that $(a, a+\delta) \subseteq I$ and

\[ f'(a+) \le \frac{f(x) - f(a)}{x - a} < f'(a+) + \epsilon/2 \quad \text{for } x \in (a, a+\delta). \]

Choose $b \in (a, a+\delta)$. Since $(f(b) - f(x))/(b-x) \to (f(b) - f(a))/(b-a)$ as
$x \searrow a$, there exists $0 < \eta < b - a$ such that

\[ \frac{f(b) - f(x)}{b - x} < \frac{f(b) - f(a)}{b - a} + \epsilon/2 < f'(a+) + \epsilon \quad \text{for } x \in (a, a+\eta). \]

Thus if $x \in (a, a+\eta)$ then

\[ f'(a+) \le f'(x+) \le \frac{f(b) - f(x)}{b - x} < f'(a+) + \epsilon. \]

The proof of (v) is exactly similar.

Corollary 7.2.4 The set $D$ of points of discontinuity of the mapping
$a \to f'(a+)$ is countable. $D$ is the set of points of discontinuity of the
mapping $a \to f'(a-)$, and is the set of points at which $f$ is not differentiable.


Proof Since the mapping $a \to f'(a+)$ is increasing, $D$ is countable, by Theorem
6.3.5. Suppose that $d \in D$. Since the mapping $y \to f'(y-)$ is continuous
on the left at $d$,

\[ f'(d-) = \lim_{y \nearrow d} f'(y-) \le \lim_{y \nearrow d} f'(y+) < f'(d+), \]

so that $f$ is not differentiable at $d$. Further, $f'(d+) \le f'(z-)$ for $z > d$, so
that $f'(d-) < \lim_{z \searrow d} f'(z-)$; $d$ is a point of discontinuity of the mapping
$a \to f'(a-)$ as well.

Conversely, if $c \notin D$ then

\[ f'(c-) = \lim_{y \nearrow c} f'(y-) = \lim_{y \nearrow c} f'(y+) = f'(c+), \]

so that $f$ is differentiable at $c$, and

\[ f'(c-) = f'(c+) = \lim_{z \searrow c} f'(z-), \]

so that the mapping $x \to f'(x-)$ is continuous at $c$.


Exercises 


7.2.1 Give an example of a convex function on [0,1] which is discontinuous 
at 0 and at 1. 
7.2.2 A real-valued function on an interval $I$ is midpoint-convex if

\[ f((a+b)/2) \le (f(a) + f(b))/2 \quad \text{for all } a, b \in I. \]

Suppose that $f$ is a midpoint-convex function on $I$.
(a) Suppose that $c-h, c, c+h \in I$, where $h > 0$. Show that if $n \in \mathbf{N}$
then

\[ \frac{f(c) - f(c-h)}{2^n} \le f(c + h/2^n) - f(c) \le \frac{f(c+h) - f(c)}{2^n}, \]

\[ \frac{f(c) - f(c+h)}{2^n} \le f(c - h/2^n) - f(c) \le \frac{f(c-h) - f(c)}{2^n}. \]

(b) Show that if $f$ is bounded on $I$ then $f$ is continuous at $c$.
(c) Show that if $f$ is bounded on $I$ then $f$ is convex.

7.2.3 State and prove results corresponding to Propositions 7.2.1 and 7.2.2
and Theorem 7.2.3 for strictly convex functions.

7.2.4 Suppose that $f$ is a convex function on an interval $I$, that $x_1, \ldots, x_n$ are
distinct points in $I$ and that $p_1, \ldots, p_n$ are positive numbers for which
$\sum_{j=1}^{n} p_j = 1$. Show that

\[ f\left( \sum_{j=1}^{n} p_j x_j \right) \le \sum_{j=1}^{n} p_j f(x_j). \qquad \text{[Jensen’s inequality]} \]

Show that the inequality is strict if $f$ is strictly convex.

7.2.5 Suppose that $f$ is a convex strictly increasing function on an open
interval $I$. Show that the inverse function $f^{-1}$ is concave and strictly
increasing on $f(I)$.


7.3 Differentiable functions on an interval 


Proposition 7.3.1 Suppose that f is a real-valued function defined on an 
interval I and that f has a local maximum or local minimum at an interior 
point c of I. If f is differentiable at c then f'(c) = 0. 


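Proof Suppose that $f$ has a local maximum at $c$. Then

\[ f'(c) = \lim_{x \searrow c} \frac{f(x) - f(c)}{x - c} \le 0 \quad \text{and} \quad f'(c) = \lim_{x \nearrow c} \frac{f(x) - f(c)}{x - c} \ge 0, \]

and so $f'(c) = 0$. The proof when $f$ has a local minimum at $c$ is exactly
similar.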


A function on an open interval $I$ which is differentiable at every point of
the interval is said to be differentiable on $I$.


Theorem 7.3.2 (Rolle’s theorem) Suppose that $f$ is a real-valued function,
defined on a closed interval $[a,b]$, which is continuous on the closed interval
$[a,b]$ and differentiable on the open interval $(a,b)$. Suppose that $f(a) = f(b)$.
Then there exists $c \in (a,b)$ such that $f'(c) = 0$.


Proof If $f(x) = f(a)$ for all $x \in (a,b)$ then $f'(x) = 0$ for all $x \in (a,b)$.
Otherwise $f$ is not monotonic on $[a,b]$, and therefore, by Corollary 6.3.7,
has a local maximum or local minimum at an interior point $c$ of $[a,b]$. Then
$f'(c) = 0$, by Proposition 7.3.1.


Corollary 7.3.3 Suppose that $f'(x) \ne 0$, for each $x \in (a,b)$. Then $f$ is
strictly monotonic.


Proof If not, $f$ is not injective (Proposition 6.4.4), and so there exist $a <
a' < b' < b$ for which $f(a') = f(b')$. But then there exists $a' < c < b'$ with
$f'(c) = 0$, giving a contradiction.


As Exercise 7.5.6 shows, if f is differentiable on an interval then the deriva- 
tive need not be continuous. Nevertheless, it satisfies an intermediate value 
property. This property is known as Darboux continuity. 


Theorem 7.3.4 Suppose that $f$ is a real-valued function which is differentiable
on the open interval $(a,b)$ and that $f'(c) < f'(d)$ for some
$a < c < d < b$. If $f'(c) < v < f'(d)$ there exists $c < e < d$ with $f'(e) = v$.

Proof We apply a shear to the graph of $f$: let $h(x) = f(x) - vx$. Then
$h'(c) = f'(c) - v < 0$ and $h'(d) = f'(d) - v > 0$. There exists $e \in [c,d]$
such that $h(e) = \inf\{h(x) : x \in [c,d]\}$. By Proposition 7.1.5 (v), there exists
$0 < \delta < d - c$ such that $h(x) < h(c)$ for $c < x < c + \delta$ and $h(x) < h(d)$ for
$d - \delta < x < d$, and so $e$ must be an interior point of $[c,d]$. Thus $h$ has a local
minimum at $e$, and $h'(e) = f'(e) - v = 0$.


We applied a shear to obtain this result. We do it again to obtain a mean- 
value theorem. 


Theorem 7.3.5 (The mean-value theorem) Suppose that $f$ is a real-valued
function which is continuous on the closed interval $[a,b]$ and differentiable
on the open interval $(a,b)$. Then there exists $a < c < b$ with

\[ f'(c) = (f(b) - f(a))/(b - a). \]

Proof Let $h_\lambda(x) = f(x) - \lambda x$. If we set $\lambda = (f(b) - f(a))/(b - a)$, then
$h_\lambda(a) = h_\lambda(b)$, and so there exists $a < c < b$ such that $h_\lambda'(c) = f'(c) - \lambda = 0$.
Thus $f'(c) = (f(b) - f(a))/(b - a)$.

This theorem says that there is a point $c$ in $(a,b)$ at which the tangent to
the graph of $f$ is parallel to the chord joining $(a, f(a))$ and $(b, f(b))$.

The next corollary is ‘obviously’ true, but it is not a trivial result; it is
however an immediate consequence of the mean-value theorem.


Corollary 7.3.6 Suppose that f is a real-valued function which is continu- 
ous on the closed interval [a,b] and differentiable on the open interval (a,b). 
If $f'(x) = 0$ for $a < x < b$ then $f$ is constant on $[a,b]$: $f(a) = f(x) = f(b)$
for all $x \in [a,b]$.
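
The mean-value theorem asserts that a suitable point $c$ exists; in simple cases it can be located numerically. The sketch below assumes the example $f(x) = x^3$ on $[0,2]$, chosen here for illustration, where $f'$ is continuous and $f'(x)$ minus the chord slope changes sign on the interval, so that bisection applies. The function name `mean_value_point` is hypothetical.

```python
import math

def mean_value_point(df, a, b, slope, iterations=100):
    """Bisection for c in (a, b) with df(c) = slope.

    Assumes df is continuous and df(a) - slope < 0 < df(b) - slope.
    """
    lo, hi = a, b
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if df(mid) - slope < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def f(x):
    return x**3

def df(x):
    return 3 * x**2

a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)     # slope of the chord, here 4
c = mean_value_point(df, a, b, slope)
print(c, df(c), slope)
```

Here $3c^2 = 4$, so the computed point agrees with $c = 2/\sqrt{3}$.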


Here is a more sophisticated mean-value theorem. 


Theorem 7.3.7 (Cauchy's mean-value theorem) Suppose that $f$ and $g$ are real-valued functions which are continuous on the closed interval $[a,b]$ and differentiable on the open interval $(a,b)$, and suppose that $g'(x) \neq 0$ for all $x \in (a,b)$. Then $g(a) \neq g(b)$, and there exists $c \in (a,b)$ such that

$$\frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)}.$$

Proof By Corollary 7.3.3, $g$ is strictly monotonic on $(a,b)$, and so $g(a) \neq g(b)$. Let

$$\lambda = \frac{f(b) - f(a)}{g(b) - g(a)}, \quad\text{and let } h_\lambda(x) = f(x) - \lambda g(x).$$

Then $h_\lambda(a) = h_\lambda(b)$, and so there exists $a < c < b$ such that $h_\lambda'(c) = f'(c) - \lambda g'(c) = 0$. Thus $f'(c)/g'(c) = (f(b) - f(a))/(g(b) - g(a))$.


Corollary 7.3.8 (L'Hôpital's rule) Suppose that $f$ and $g$ are real-valued functions which are continuous on the closed interval $[a,b]$ and differentiable on the open interval $(a,b)$, that $f(a) = g(a) = 0$ and that $g'(x) \neq 0$ for all $x \in (a,b)$. If $f'(x)/g'(x) \to l$ as $x \searrow a$ then $f(x)/g(x) \to l$ as $x \searrow a$.

Proof Suppose that $\epsilon > 0$. There exists $0 < \delta < b - a$ such that $|f'(x)/g'(x) - l| < \epsilon$ for $a < x < a + \delta$. If $a < x < a + \delta$ there exists $a < c < x$ such that

$$\frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(c)}{g'(c)},$$

and so

$$\left|\frac{f(x)}{g(x)} - l\right| = \left|\frac{f'(c)}{g'(c)} - l\right| < \epsilon.$$
Exercises 


7.3.1 Suppose that $f$ is continuous on $[a,b]$ and differentiable on $(a,b)$ and that $f$ has a derivative on the right at $a$ [which we denote here by $f'(a)$] and a derivative on the left at $b$ [which we denote here by $f'(b)$]. Show that if $f'$ is continuous on $[a,b]$ and $\epsilon > 0$ then there exists $\delta > 0$ such that

$$\left|\frac{f(x) - f(y)}{x - y} - f'(x)\right| < \epsilon \quad\text{if } x, y \in [a,b] \text{ and } 0 < |x - y| < \delta.$$


7.3.2 Suppose that $a_0, \ldots, a_n \in \mathbf{R}$ and that

$$a_0 + \frac{a_1}{2} + \cdots + \frac{a_{n-1}}{n} + \frac{a_n}{n+1} = 0.$$

Show that there exists $0 < c < 1$ such that

$$a_0 + a_1 c + \cdots + a_n c^n = 0.$$


7.3.3 Suppose that f is continuous on [a, b] and differentiable on (a, b). Show 
that f’ is an increasing function on (a,b) if and only if f is convex. 

7.3.4 Suppose that $f$ is a real-valued function on an interval $I$ which satisfies $|f(x) - f(y)| \le |x - y|^2$ for all $x, y \in I$. Show that $f$ is constant.

7.3.5 Suppose that $f$ is a differentiable function on $\mathbf{R}$ and that $f'(x) \to l$ as $x \to +\infty$. Show that $f(x)/x \to l$ as $x \to +\infty$.

7.3.6 Suppose that $f$ is continuous on $[a,b]$ and differentiable on $(a,b)$, and that $f'(x) \to l$ as $x \searrow a$. Show that $f$ is differentiable on the right at $a$ and that $f'(a+) = l$.

7.3.7 Suppose that $f$ is a differentiable function on $[a,b]$ and that $f'$ is continuous on $[a,b]$. Let $N = \{x \in [a,b] : f'(x) = 0\}$. Suppose that $\epsilon > 0$. Show that there are finitely many disjoint intervals $I_1, \ldots, I_k$ in $[a,b]$ such that $N \subseteq \bigcup_{j=1}^{k} I_j$ and such that $|f'(x)| < \epsilon$ for $x \in \bigcup_{j=1}^{k} I_j$. Show that $f(N)$ is a closed subset of $\mathbf{R}$ with no interior points.

7.3.8 Suppose that $\alpha$ is an algebraic number which is not rational. Show that there exists a non-zero polynomial $p(x) = a_n x^n + \cdots + a_0$ with integer coefficients such that $p(\alpha) = 0$, whereas $p(r) \neq 0$ for $r \in \mathbf{Q}$. Thus if $r = p/q$ then $q^n p(r)$ is a non-zero integer. Let

$$M = \sup\{|p'(x)| : \alpha - 1 \le x \le \alpha + 1\}.$$

Suppose that $r = p/q \in \mathbf{Q}$ and that $|r - \alpha| < 1$. Use the mean-value theorem to show that $|r - \alpha| \ge 1/Mq^n$. (This result is due to Liouville.) Let $x = \sum_{n=1}^{\infty} 10^{-n!}$. Show that $x$ is not rational. Show that $x$ is not algebraic.


7.4 The exponential and logarithmic functions; powers 


We now consider how the results that we have obtained can be used to 
establish properties of some of the fundamental functions of analysis. 
We have defined the real exponential function

$$\exp(x) = 1 + x + \frac{x^2}{2!} + \cdots = \sum_{j=0}^{\infty} \frac{x^j}{j!}$$

for $x \in \mathbf{R}$, and have shown that $\exp(x + y) = \exp(x)\exp(y)$, and that exp is differentiable, with derivative $\exp'(x) = \exp(x)$. We set $e = \exp(1)$. The reader should use these results, and the results that have been proved, to justify the following statements.


1. exp is a non-negative strictly increasing function on $\mathbf{R}$.

2. exp is a strictly convex function on $\mathbf{R}$.

3. If $n \in \mathbf{Z}^+$ then $\exp(x)/x^n \to \infty$ as $x \to +\infty$.

4. If $n \in \mathbf{Z}^+$ then $x^n \exp(x) \to 0$ as $x \to -\infty$.

5. exp is a continuous bijection of $\mathbf{R}$ onto $(0,\infty)$ which is an isomorphism of the additive group $(\mathbf{R},+)$ onto the multiplicative group $((0,\infty),\times)$.

6. The inverse mapping from $(0,\infty)$ to $\mathbf{R}$, which is called the logarithmic function, and denoted $\log x$, is differentiable, and $d\log x/dx = 1/x$, for $0 < x < \infty$.

7. If $x, y > 0$ then $\log xy = \log x + \log y$.

8. $\log x$ is a strictly increasing strictly concave function on $(0,\infty)$.

9. $\log 1 = 0$, $\log e = 1$, $\log x \to \infty$ as $x \to \infty$ and $\log x \to -\infty$ as $x \searrow 0$.

10. If $m \in \mathbf{N}$ then $\log x / x^{1/m} \to 0$ as $x \to \infty$ and $x^{1/m}\log x \to 0$ as $x \searrow 0$.

11. $1/x = \exp(-\log x)$, and if $x > 0$ and $n \in \mathbf{N}$ then

$$x^n = \exp(n \log x) \quad\text{and}\quad x^{1/n} = \exp((\log x)/n).$$

Thus if $r = p/q \in \mathbf{Q}$ then $x^{p/q} = \exp((p/q)\log x)$.

This leads us to define $x^\alpha = \exp(\alpha \log x)$, for $x > 0$ and $\alpha \in \mathbf{R}$. $x^\alpha$ is $x$ raised to the power $\alpha$. Note that, with this terminology, $\exp(x) = \exp(x \log e) = e^x$. In future, we shall usually write $e^x$ for $\exp x$.

12. If $x > 0$ and $\alpha, \beta \in \mathbf{R}$ then $x^{\alpha+\beta} = x^\alpha x^\beta$ and $x^0 = 1$.

13. For fixed $x > 0$, the function $\alpha \to x^\alpha$ from $\mathbf{R}$ to $(0,\infty)$ is continuous. Thus if $(r_n)_{n \in \mathbf{N}}$ is a sequence in $\mathbf{Q}$ and $r_n \to \alpha$ then $x^{r_n} \to x^\alpha$ as $n \to \infty$.

14. For fixed $x > 0$, the function $\alpha \to x^\alpha$ from $\mathbf{R}$ to $(0,\infty)$ is differentiable, with derivative $dx^\alpha/d\alpha = x^\alpha \log x$.

15. If $x > 1$ then the function $\alpha \to x^\alpha$ from $\mathbf{R}$ to $(0,\infty)$ is a strictly convex and strictly increasing bijection of $\mathbf{R}$ onto $(0,\infty)$.

16. If $0 < x < 1$ then the function $\alpha \to x^\alpha$ from $\mathbf{R}$ to $(0,\infty)$ is a strictly convex and strictly decreasing bijection of $\mathbf{R}$ onto $(0,\infty)$.

17. For fixed $\alpha \in \mathbf{R}$, the function $x \to x^\alpha$ from $(0,\infty)$ to $(0,\infty)$ is differentiable, with derivative $dx^\alpha/dx = \alpha x^{\alpha-1}$.

18. If $\alpha > 1$ then the function $x \to x^\alpha$ is a strictly increasing strictly convex bijection of $(0,\infty)$ onto $(0,\infty)$.

19. If $0 < \alpha < 1$ then the function $x \to x^\alpha$ is a strictly increasing strictly concave bijection of $(0,\infty)$ onto $(0,\infty)$.

20. If $\alpha < 0$ then the function $x \to x^\alpha$ is a strictly decreasing strictly convex bijection of $(0,\infty)$ onto $(0,\infty)$.
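The definition of a real power through exp and log can be tried out directly. The sketch below is only an illustration in floating-point arithmetic: the function name `power` is an assumption of this example, and Python's `math.exp` and `math.log` stand in for the functions exp and log of the text.

```python
import math


def power(x, alpha):
    """x raised to the real power alpha, computed as exp(alpha * log x) for x > 0."""
    if x <= 0:
        raise ValueError("x must be strictly positive")
    return math.exp(alpha * math.log(x))


# The exponent law for powers: x^(alpha + beta) = x^alpha * x^beta.
law_lhs = power(3.0, 1.2 + 0.7)
law_rhs = power(3.0, 1.2) * power(3.0, 0.7)

# The derivative in alpha, estimated by a central difference,
# compared with the claimed value x^alpha * log x.
x, alpha, h = 2.5, 0.3, 1e-6
numeric_derivative = (power(x, alpha + h) - power(x, alpha - h)) / (2 * h)
claimed_derivative = power(x, alpha) * math.log(x)
```

Agreement is only up to rounding error, so comparisons should use a tolerance rather than exact equality.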

A strictly positive function $f$ on an interval $I$ is logarithmically convex if $\log f$ is convex. Since

$$(1 - \theta)\log f(x) + \theta \log f(y) = \log(f(x)^{1-\theta} f(y)^{\theta}),$$

$f$ is logarithmically convex if and only if

$$f((1 - \theta)x + \theta y) \le f(x)^{1-\theta} f(y)^{\theta}$$

for $x, y \in I$ and $0 \le \theta \le 1$. Since $f = e^{\log f}$, it follows from Proposition 7.2.2 (v) that a logarithmically convex function is convex.

Figure 7.4. The exponential and logarithmic functions.


Exercises 


7.4.1 Show that $n^{1/n} \to 1$ as $n \to \infty$.

7.4.2 Use the strict concavity of log to prove the following generalization of the arithmetic mean-geometric mean inequality. Suppose that $x_1, \ldots, x_n$ are positive numbers and that $p_1, \ldots, p_n$ are strictly positive numbers with $\sum_{j=1}^{n} p_j = 1$. Show that

$$x_1^{p_1} x_2^{p_2} \cdots x_n^{p_n} \le p_1 x_1 + p_2 x_2 + \cdots + p_n x_n,$$

with equality if and only if $x_1 = x_2 = \cdots = x_n$.

7.4.3 Suppose that $1 < p, q < \infty$. If $1/p + 1/q = 1$, then $p$ and $q$ are called conjugate indices. Suppose that $p$ and $q$ are conjugate indices and that $x$ and $y$ are non-negative numbers. Show that $xy \le x^p/p + y^q/q$. When does equality hold?
Suppose now that $x_1, \ldots, x_n, y_1, \ldots, y_n$ are real numbers, and that $\sum_{j=1}^{n} |x_j y_j| \neq 0$. Let $S = (\sum_{j=1}^{n} |x_j|^p)^{1/p}$, $T = (\sum_{j=1}^{n} |y_j|^q)^{1/q}$ and let $a_j = x_j/S$, $b_j = y_j/T$ for $1 \le j \le n$. Show that $\sum_{j=1}^{n} |a_j b_j| \le 1$. Deduce that

$$\Big|\sum_{j=1}^{n} x_j y_j\Big| \le \sum_{j=1}^{n} |x_j y_j| \le \Big(\sum_{j=1}^{n} |x_j|^p\Big)^{1/p} \Big(\sum_{j=1}^{n} |y_j|^q\Big)^{1/q}$$

(Hölder's inequality). Note that this generalizes Cauchy's inequality. When does equality hold? Extend this result to infinite sums.

7.4.4 Here is another version of Hölder's inequality. Suppose that $x_1, \ldots, x_n, y_1, \ldots, y_n$ are real numbers, and that $a_1, \ldots, a_n$ are non-negative numbers. Show that

$$\Big|\sum_{j=1}^{n} a_j x_j y_j\Big| \le \sum_{j=1}^{n} a_j |x_j y_j| \le \Big(\sum_{j=1}^{n} a_j |x_j|^p\Big)^{1/p} \Big(\sum_{j=1}^{n} a_j |y_j|^q\Big)^{1/q}.$$


7.4.5 Show that the function $\log((1 + x)/(1 - x))$ is a strictly increasing bijection of $(-1,1)$ onto $\mathbf{R}$. Show that it is convex on $(0,1)$.

7.4.6 Suppose that $y > 0$. Show that $(y - 1)^2 \ge y(\log y)^2$.

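7.4.7 Using the convexity of the function $4^{-x}$, show that if $0 < x < 1/2$ then $1 - x > 4^{-x}$. Let $(p_1, p_2, \ldots)$ be an enumeration of the primes in increasing order. Show that, for each $n \in \mathbf{N}$,

$$\sum_{j=1}^{n} \frac{1}{j} \le \prod_{j=1}^{n} \Big(1 - \frac{1}{p_j}\Big)^{-1}.$$

Deduce that $\sum_{j=1}^{n} 1/j < 4^{t_n}$, where $t_n = \sum_{j=1}^{n} 1/p_j$, and deduce that $\sum_{j=1}^{\infty} 1/p_j = \infty$.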


7.4.8 Use the mean-value theorem to show that if $x > 0$ then

$$\frac{x}{1 + x} < \log(1 + x) < x.$$

Deduce that $(1 + x)^{1/x} \nearrow e$ as $x \searrow 0$.




7.4.9 Sketch the graph of the function $f(x) = x \log x$, for $x \in (0,\infty)$. Is it convex? Does it have any maxima or minima? What is $\lim_{x \searrow 0} f(x)$? What is $\lim_{x \searrow 0} f'(x)$?

7.4.10 Suppose that $f$ is positive and differentiable on an open interval $I$. Show that $(\log f)'(x) = f'(x)/f(x)$. Let $g(x) = x^x$, for $x \in (0,\infty)$. Calculate $g'(x)$. Show that $g$ is logarithmically convex. Sketch the graph of the function $g$, answering the same questions as in the previous exercise.
7.4.11 Investigate 
lim eae a te a and lim a(all® —1) — zh 
a\0 x too =. log 


7.5 The circular functions 


Next we consider the cosine and sine functions. These functions arise in geom- 
etry and trigonometry, but we are not yet in a position to consider this aspect 
of things. Instead, we treat them in a purely analytic way. As we shall see 
later, they also have an important part to play in complex analysis, and this 
will also throw more light on them. 

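Each of the power series

$$\cos z = \sum_{k=0}^{\infty} (-1)^k \frac{z^{2k}}{(2k)!} = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \cdots,$$

$$\sin z = \sum_{k=0}^{\infty} (-1)^k \frac{z^{2k+1}}{(2k+1)!} = z - \frac{z^3}{3!} + \frac{z^5}{5!} - \cdots$$

has infinite radius of convergence.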
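The two power series can be summed numerically. The following sketch computes partial sums for real arguments; the names `cos_series` and `sin_series` and the default of 30 terms are assumptions of this illustration, and the comparison with Python's `math.cos` and `math.sin` is only a sanity check, not part of the analytic development.

```python
import math


def cos_series(x, terms=30):
    """Partial sum of sum_k (-1)^k x^(2k) / (2k)! with `terms` terms."""
    total, term = 0.0, 1.0                       # term holds (-1)^k x^(2k) / (2k)!
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total


def sin_series(x, terms=30):
    """Partial sum of sum_k (-1)^k x^(2k+1) / (2k+1)! with `terms` terms."""
    total, term = 0.0, x                         # term holds (-1)^k x^(2k+1) / (2k+1)!
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total


sample_points = [-3.0, -1.0, 0.0, 0.5, 2.0, 3.0]
worst_cos_error = max(abs(cos_series(x) - math.cos(x)) for x in sample_points)
worst_sin_error = max(abs(sin_series(x) - math.sin(x)) for x in sample_points)
```

Each term is obtained from the previous one by a single multiplication, which avoids computing large factorials and large powers separately.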

The cosine function cos is an even function (cos z = cos(—z)) and the sine 
function sin is an odd function (sin z = — sin(—z)). 

Following custom, if $n \in \mathbf{N}$ we write $\cos^n z$ for $(\cos z)^n$ and $\sin^n z$ for $(\sin z)^n$. But $1/\cos z$ is denoted by $\sec z$ and $1/\sin z$ is denoted by $\operatorname{cosec} z$: $\cos^{-1}$ and $\sin^{-1}$ have quite different meanings (see Exercises 7.5.3 and 7.5.4).

We restrict attention to the real-valued functions cos and sin, defined on 
the real line R. 


Theorem 7.5.1 $\cos x$ is differentiable, and $\cos' x = -\sin x$. $\sin x$ is differentiable, and $\sin' x = \cos x$.


Proof First we establish an elementary inequality. We prove this for complex numbers, since we shall need such an inequality in Volume III.

Lemma 7.5.2 Suppose that $z, w \in \mathbf{C}$ and that $n \in \mathbf{N}$. Then

$$|(z + w)^n - z^n - nwz^{n-1}| \le \frac{n(n-1)}{2} |w|^2 (|z| + |w|)^{n-2}.$$

Proof The result is trivially true if $n = 1$ or $2$. Suppose that $n \ge 3$. By the binomial theorem,

$$(z + w)^n - z^n - nwz^{n-1} = w^2 \sum_{j=2}^{n} \binom{n}{j} w^{j-2} z^{n-j} = w^2 \sum_{k=0}^{n-2} \frac{n(n-1)}{(k+2)(k+1)} \binom{n-2}{k} w^k z^{n-k-2},$$

so that

$$|(z + w)^n - z^n - nwz^{n-1}| \le \frac{n(n-1)}{2} |w|^2 \sum_{k=0}^{n-2} \binom{n-2}{k} |w|^k |z|^{n-k-2} = \frac{n(n-1)}{2} |w|^2 (|z| + |w|)^{n-2}.$$

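We now prove the theorem. Suppose that $h \neq 0$. Then

$$\frac{\cos(x + h) - \cos x}{h} + \sin x = \sum_{k=1}^{\infty} \frac{(-1)^k}{(2k)!} \cdot \frac{(x + h)^{2k} - x^{2k} - 2khx^{2k-1}}{h},$$

so that, by Lemma 7.5.2,

$$\left|\frac{\cos(x + h) - \cos x}{h} + \sin x\right| \le \sum_{k=1}^{\infty} \frac{|(x + h)^{2k} - x^{2k} - 2khx^{2k-1}|}{(2k)!\,|h|} \le \frac{|h|}{2} \sum_{k=1}^{\infty} \frac{(|x| + |h|)^{2k-2}}{(2k-2)!} \le \frac{|h|}{2} e^{|x| + |h|},$$

which tends to $0$ as $h \to 0$.

A similar argument, left to the reader as an easy exercise, establishes the result for $\sin x$.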


Corollary 7.5.3 $\cos^2 x + \sin^2 x = 1$, so that $-1 \le \cos x \le 1$ and $-1 \le \sin x \le 1$.

Proof Let $f(x) = \cos^2 x + \sin^2 x$. Then

$$f'(x) = 2\cos x(-\sin x) + 2\sin x \cos x = 0,$$

so that $f$ is constant, by the mean-value theorem. Thus $f(x) = f(0) = 1$.


The alternating series test shows that $\sin x \ge x - x^3/3! = x(1 - x^2/6) > 0$ for $0 < x \le 2$, so that $\cos x$ is strictly decreasing on $[0,2]$. The alternating series test also shows that $\cos x \ge 1 - x^2/2 > 0$ for $0 \le x < \sqrt{2}$, and that $\cos \sqrt{3} \le 1 - 3/2 + 9/24 = -1/8$. Thus, by the intermediate-value theorem, there exists $\sqrt{2} < x_0 < \sqrt{3}$ such that $\cos x_0 = 0$. Since the function cos is strictly decreasing on $[0,2]$, $x_0$ is unique. We set $\pi = 2x_0$.
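The definition of $\pi$ as twice the unique zero of cos between $\sqrt{2}$ and $\sqrt{3}$ can be turned into a small computation. The sketch below is an illustration only: it sums the cosine series in floating point and brackets the zero by bisection; the names `cos_series` and `first_zero_of_cos` and the iteration counts are assumptions of this example.

```python
import math


def cos_series(x, terms=30):
    """Partial sum of the cosine power series sum_k (-1)^k x^(2k) / (2k)!."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total


def first_zero_of_cos(iterations=60):
    """Bisect on [sqrt 2, sqrt 3], where the series is positive on the left
    and negative on the right, to approximate the zero x_0 of cos."""
    lo, hi = math.sqrt(2.0), math.sqrt(3.0)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if cos_series(mid) > 0:
            lo = mid      # the zero lies to the right of mid
        else:
            hi = mid      # the zero lies at or to the left of mid
    return (lo + hi) / 2


pi_estimate = 2 * first_zero_of_cos()
```

Bisection is used because it needs only the sign change and the monotonicity established in the text, not any further property of cos.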

Since $\sin' x = \cos x$ is positive on $(0, \pi/2)$, $\sin x$ is strictly increasing on $[0, \pi/2]$. Since $\sin 0 = 0$ and

$$\sin^2(\pi/2) = \sin^2(\pi/2) + \cos^2(\pi/2) = 1,$$

it follows that $\sin(\pi/2) = 1$. Since $\cos' x = -\sin x$ is negative and decreasing on $(0, \pi/2)$, $\cos x$ is decreasing and concave on $[0, \pi/2]$; since $\cos x$ is an even function, $\cos x$ is concave on $[-\pi/2, \pi/2]$. Similarly, $\sin x$ is convex on $[-\pi/2, 0]$ and concave on $[0, \pi/2]$.

In order to go further, we need the addition formula.


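Theorem 7.5.4 If $x, y \in \mathbf{R}$ then $\sin(x + y) = \sin x \cos y + \cos x \sin y$.

Proof Since the series are absolutely convergent, we can expand the products as Cauchy products.

$$\sin x \cos y = \sum_{n=0}^{\infty} (-1)^n \Big(\sum_{j+k=n} \frac{x^{2j+1} y^{2k}}{(2j+1)!\,(2k)!}\Big).$$

Similarly

$$\cos x \sin y = \sum_{n=0}^{\infty} (-1)^n \Big(\sum_{j+k=n} \frac{x^{2j} y^{2k+1}}{(2j)!\,(2k+1)!}\Big).$$

Adding,

$$\sin x \cos y + \cos x \sin y = \sum_{l=0}^{\infty} (-1)^l \Big(\sum_{j+k=2l+1} \frac{x^j y^k}{j!\,k!}\Big) = \sum_{l=0}^{\infty} (-1)^l \frac{(x + y)^{2l+1}}{(2l+1)!} = \sin(x + y).$$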


Corollary 7.5.5 $\cos(x + y) = \cos x \cos y - \sin x \sin y$.

Proof Differentiate with respect to $x$, or with respect to $y$.

Corollary 7.5.6 $\sin(x + \pi/2) = \cos x$ and $\cos(x + \pi/2) = -\sin x$.

Proof Put $y = \pi/2$.

Corollary 7.5.7 $\sin(x + \pi) = -\sin x$ and $\cos(x + \pi) = -\cos x$. $\sin(x + 2\pi) = \sin x$ and $\cos(x + 2\pi) = \cos x$.

Thus the cosine and sine functions are periodic, with period $2\pi$.


Proposition 7.5.8 Suppose that $(x,y) \in \mathbf{R}^2$ and that $x^2 + y^2 = r^2 > 0$. Then there exists a unique $\theta \in (-\pi, \pi]$ such that $x = r\cos\theta$ and $y = r\sin\theta$.

Proof Suppose first that $y$ is non-negative. Since $\cos 0 = 1$ and $\cos \pi = -1$, and since $-1 \le x/r \le 1$, it follows from the intermediate value theorem that there exists $0 \le \theta \le \pi$ such that $x/r = \cos\theta$. $\theta$ is unique, since cos is a strictly decreasing function on $[0,\pi]$. Then $(y/r)^2 = 1 - (x/r)^2 = 1 - \cos^2\theta = \sin^2\theta$, and so $y = r\sin\theta$, since $y$ and $\sin\theta$ are both non-negative.

If $y < 0$ then there exists $0 < \phi < \pi$ such that $x = r\cos\phi$ and $-y = r\sin\phi$. Let $\theta = -\phi$. Then $x = r\cos\theta$ and $y = r\sin\theta$, and again, $\theta$ is uniquely determined.


We now consider the complex case. Inspection shows that

$$\cos z = \frac{e^{iz} + e^{-iz}}{2} \quad\text{and}\quad \sin z = \frac{e^{iz} - e^{-iz}}{2i},$$

and so we obtain Euler's formula

$$e^{iz} = \cos z + i\sin z.$$

In particular, if $x \in \mathbf{R}$ then $\cos x$ and $\sin x$ are the real and imaginary parts of $e^{ix}$.

Figure 7.5. The cosine and sine functions.


Proposition 7.5.9 The mapping

$$t \to e^{it} = \cos t + i\sin t : \mathbf{R} \to \mathbf{T} = \{z : |z| = 1\}$$

is a continuous homomorphism of the additive group $(\mathbf{R},+)$ onto the multiplicative group $(\mathbf{T},.)$, with kernel $2\pi\mathbf{Z}$.

Proof The mapping is certainly continuous, and is a homomorphism into $(\mathbf{T},.)$. It is surjective, by Proposition 7.5.8. Finally, $e^{it} = \cos t + i\sin t = 1$ if and only if $\cos t = 1$ and $\sin t = 0$, which happen if and only if $t = 2\pi k$, for some $k \in \mathbf{Z}$.

In fact, most of the properties of the real-valued functions cos and sin can be deduced from the equation $e^{it} = \cos t + i\sin t$. For example,

$$\cos^2 t + \sin^2 t = |e^{it}|^2 = e^{it}\,\overline{e^{it}} = e^{it}e^{-it} = 1.$$


Here are two more examples. 


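Example 7.5.10 If $n \in \mathbf{N}$ then

$$\cos nt = \sum_{0 \le 2k \le n} (-1)^k \binom{n}{2k} \sin^{2k} t \cos^{n-2k} t$$

$$\text{and}\quad \sin nt = \sum_{0 \le 2k+1 \le n} (-1)^k \binom{n}{2k+1} \sin^{2k+1} t \cos^{n-2k-1} t.$$

For

$$e^{int} = (\cos t + i\sin t)^n = \sum_{j=0}^{n} \binom{n}{j} i^j \sin^j t \cos^{n-j} t$$

$$= \sum_{0 \le 2k \le n} (-1)^k \binom{n}{2k} \sin^{2k} t \cos^{n-2k} t + i \sum_{0 \le 2k+1 \le n} (-1)^k \binom{n}{2k+1} \sin^{2k+1} t \cos^{n-2k-1} t.$$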


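Example 7.5.11 If $0 < |t| < \pi$ and $n \in \mathbf{N}$ then

$$\sum_{j=-n}^{n} \cos jt = \frac{\sin(n + \frac{1}{2})t}{\sin t/2}.$$

For

$$\sum_{j=-n}^{n} \cos jt = \sum_{j=-n}^{n} e^{ijt} = e^{-int} \sum_{j=0}^{2n} e^{ijt}$$

$$= e^{-int}\,\frac{e^{i(2n+1)t} - 1}{e^{it} - 1} = \frac{e^{i(n+\frac{1}{2})t} - e^{-i(n+\frac{1}{2})t}}{e^{it/2} - e^{-it/2}} = \frac{\sin(n + \frac{1}{2})t}{\sin t/2}.$$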
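The closed form in Example 7.5.11 is easy to check numerically. The sketch below compares the direct sum with the quotient of sines at a few sample values; the function names `cosine_sum` and `closed_form` and the sample values of $n$ and $t$ are assumptions of this illustration.

```python
import math


def cosine_sum(n, t):
    """Direct evaluation of sum_{j=-n}^{n} cos(j t)."""
    return sum(math.cos(j * t) for j in range(-n, n + 1))


def closed_form(n, t):
    """sin((n + 1/2) t) / sin(t / 2), valid when sin(t / 2) is non-zero."""
    return math.sin((n + 0.5) * t) / math.sin(t / 2)


# Largest discrepancy over a small grid of sample values with 0 < |t| < pi.
largest_gap = max(
    abs(cosine_sum(n, t) - closed_form(n, t))
    for n in (1, 5, 20)
    for t in (0.3, 1.0, -2.5)
)
```

The two sides agree only up to rounding error, which grows slightly with $n$ because more terms are added.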


Exercises

7.5.1 Show that $2x/\pi < \sin x < x$ for $0 < x < \pi/2$.

7.5.2 Use the mean-value theorem to show that

$$0 < \frac{1}{\sin t} - \frac{1}{t} < t \quad\text{for } 0 < t < \pi/2.$$

7.5.3 Show that the function sin is a continuous strictly increasing bijection of $[-\pi/2, \pi/2]$ onto $[-1,1]$. The inverse mapping is denoted by $\sin^{-1}$, or arcsin. Show that

$$\frac{d\sin^{-1}}{dx}(x) = \frac{1}{\sqrt{1 - x^2}} \quad\text{for } -1 < x < 1.$$

7.5.4 Show that the function cos is a continuous strictly decreasing bijection of $[0, \pi]$ onto $[-1,1]$. The inverse mapping is denoted by $\cos^{-1}$, or arccos. Show that

$$\frac{d\cos^{-1}}{dx}(x) = -\frac{1}{\sqrt{1 - x^2}} \quad\text{for } -1 < x < 1.$$

7.5.5 Explain why

$$\frac{d(\sin^{-1} + \cos^{-1})}{dx}(x) = 0 \quad\text{for } -1 < x < 1.$$

7.5.6 Let $f(x) = \sin(1/x)$ for $x \neq 0$ and let $f(0) = 0$. Sketch the graph of $f$.

(a) For what values of $\alpha$ is the function $x^\alpha f(x)$ continuous on $\mathbf{R}$?

(b) For what values of $\alpha$ is the function $x^\alpha f(x)$ differentiable on $\mathbf{R}$?

(c) For what values of $\alpha$ is the function $x^\alpha f(x)$ continuously differentiable on $\mathbf{R}$?

(d) For what values of $\alpha$ is the function $x + x^\alpha f(x)$ strictly increasing on $\mathbf{R}$?

7.5.7 The tangent function tan is defined as $\tan x = \sin x/\cos x$ for $-\pi/2 < x < \pi/2$. Show that it is a strictly increasing differentiable mapping of $(-\pi/2, \pi/2)$ onto $\mathbf{R}$. Its inverse is denoted by $\tan^{-1}$, or arctan. What is the derivative of $\tan^{-1}$?

7.5.8 Investigate

$$\lim_{x \to 0} \frac{x(1 - \cos x)}{x - \sin x} \quad\text{and}\quad \lim_{x \to 0} \frac{\tan^2 x - \sin^2 x}{(1 - \cos x)}.$$

7.5.9 Let $f(x) = x + 2x^2\sin(1/x)$, for $x \neq 0$ and let $f(0) = 0$. Show that $f$ is differentiable on $\mathbf{R}$, and calculate its derivative. Show that $f'(0) = 1$, but that there is no interval $(-\delta, \delta)$ on which $f$ is monotonic.


7.6 Higher derivatives, and Taylor's theorem


Suppose that $f$ is a differentiable function on an open interval $I$. Then, as with the functions exp, log, sin and cos, it may happen that $f'$ is also differentiable. We then denote the derivative of $f'$ by $f''$, or $f^{(2)}$, or $d^2f/dx^2$. The process may continue, and we obtain higher derivatives $f^{(n)}$, or $d^nf/dx^n$, of order $n$. If $f$ has derivatives of all orders, we say that $f$ is infinitely differentiable.

We have the following formula for products. 


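Proposition 7.6.1 (Leibniz' formula) Suppose that $f, g$ are functions on an open interval which have derivatives of order $n$. Then

$$(fg)^{(n)} = f^{(n)}g + nf^{(n-1)}g' + \cdots + \binom{n}{r} f^{(n-r)}g^{(r)} + \cdots + fg^{(n)}.$$

Proof The proof is by induction on $n$. It is true for $n = 1$, by Proposition 7.1.5 (iii). Suppose that it is true for $n$. Using the result for $n = 1$,

$$(f^{(n-r)}g^{(r)})' = f^{(n-r+1)}g^{(r)} + f^{(n-r)}g^{(r+1)},$$

so that

$$(fg)^{(n+1)} = \sum_{r=0}^{n} \binom{n}{r} \big(f^{(n-r+1)}g^{(r)} + f^{(n-r)}g^{(r+1)}\big) = \sum_{r=0}^{n+1} \Big(\binom{n}{r} + \binom{n}{r-1}\Big) f^{(n+1-r)}g^{(r)} = \sum_{r=0}^{n+1} \binom{n+1}{r} f^{(n+1-r)}g^{(r)},$$

since

$$\binom{n}{r} + \binom{n}{r-1} = \binom{n+1}{r},$$

by de Moivre's formula.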
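Leibniz' formula can be checked mechanically on polynomials, where differentiation is exact. In the sketch below a polynomial is a list of integer coefficients `[c0, c1, c2, ...]`; this representation and the helper names `deriv`, `mul`, `add`, `trim` and `leibniz` are assumptions made for the illustration, not notation from the text.

```python
from math import comb


def deriv(p, order=1):
    """Differentiate the polynomial with coefficients p = [c0, c1, ...] `order` times."""
    for _ in range(order):
        p = [k * c for k, c in enumerate(p)][1:] or [0]
    return p


def mul(p, q):
    """Product of two polynomials given by coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def add(p, q):
    """Sum of two polynomials, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p


def leibniz(f, g, n):
    """Right-hand side of Leibniz' formula: sum_r C(n, r) f^(n-r) g^(r)."""
    total = [0]
    for r in range(n + 1):
        term = [comb(n, r) * c for c in mul(deriv(f, n - r), deriv(g, r))]
        total = add(total, term)
    return trim(total)


f = [1, 2, 3]        # 1 + 2x + 3x^2
g = [0, 1, 0, 1]     # x + x^3
direct = trim(deriv(mul(f, g), 3))   # differentiate the product three times
via_formula = leibniz(f, g, 3)       # apply the formula with n = 3
```

Since $fg = x + 2x^2 + 4x^3 + 2x^4 + 3x^5$, both computations should produce the coefficients of $24 + 48x + 180x^2$.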


If $f$ is differentiable at $a$, then the function $t_a(x) = f(a) + (x - a)f'(a)$ provides a linear approximation to $f$; if $r_a = f - t_a$, then $r_a(x) = o(|x - a|)$. Suppose that $f$ has higher derivatives, up to order $n$; can we obtain a better polynomial approximation? Let us consider a polynomial which has the same derivatives as $f$ at $a$. Let

$$p_n(x) = f(a) + (x - a)f'(a) + \frac{(x - a)^2}{2!}f''(a) + \cdots + \frac{(x - a)^n}{n!}f^{(n)}(a).$$

Then $p_n$ is a polynomial of degree at most $n$, $p_n(a) = f(a)$ and

$$p_n^{(s)}(x) = f^{(s)}(a) + (x - a)f^{(s+1)}(a) + \frac{(x - a)^2}{2!}f^{(s+2)}(a) + \cdots + \frac{(x - a)^{n-s}}{(n-s)!}f^{(n)}(a),$$

so that $p_n^{(s)}(a) = f^{(s)}(a)$ for $1 \le s \le n$. Let $r_{n+1} = f - p_n$: $r_{n+1}$ is the remainder term. Then $r_{n+1}(a) = 0$ and $r_{n+1}^{(s)}(a) = 0$ for $1 \le s \le n$, and we might hope that the remainder term is small, so that $p_n$ is an even better approximation to $f$, near $a$.

Taylor’s theorem provides information about the remainder term. We give 
two versions of this theorem here, and shall give another one in Theorem 
8.7.3. The different versions each depend in detail upon the conditions that 
are placed on f and on its derivatives. 


Theorem 7.6.2 (Taylor's theorem, with Lagrange's remainder) Suppose that $f$ is a continuous function on $[a,b]$ which is $n$-times differentiable on $[a,b)$ (with one-sided derivatives at $a$). Then there exists $c \in (a,b)$ such that

$$f(b) = f(a) + (b - a)f'(a) + \cdots + \frac{(b - a)^{n-1}}{(n-1)!}f^{(n-1)}(a) + \frac{(b - a)^n}{n!}f^{(n)}(c) = p_{n-1}(b) + \frac{(b - a)^n}{n!}f^{(n)}(c).$$

Proof The proof is just like the proof of the mean-value theorem. We shall assume that $a < b$; a similar proof applies if $b < a$. Let $h_\lambda(x) = f(x) - p_{n-1}(x) - \lambda(x - a)^n/n!$, where $\lambda$ is a real number chosen so that $h_\lambda(b) = 0$. Then $h_\lambda$ is continuous on $[a,b]$, $h_\lambda(a) = 0$, and $h_\lambda^{(s)}(a) = 0$ for $1 \le s < n$.

We need to show that there exists $a < c < b$ such that $\lambda = f^{(n)}(c)$. To do this, we repeatedly use Rolle's theorem. Since $h_\lambda(a) = h_\lambda(b) = 0$, there exists $a < c_1 < b$ such that $h_\lambda'(c_1) = 0$. Now $h_\lambda'$ is continuous on $[a, c_1]$ and $h_\lambda'(a) = h_\lambda'(c_1) = 0$, and so, using Rolle's theorem again, there exists $a < c_2 < c_1$ such that $h_\lambda''(c_2) = 0$. Continuing in this way (that is, giving a proof by induction), we find that there exist $a < c_n < c_{n-1} < \cdots < c_1 < b$ such that $h_\lambda^{(n)}(c_n) = 0$. But $h_\lambda^{(n)} = f^{(n)} - \lambda$; setting $c = c_n$, we see that $\lambda = f^{(n)}(c)$. Thus

$$f(b) = p_{n-1}(b) + \frac{(b - a)^n}{n!}f^{(n)}(c).$$

For the second theorem, we impose slightly stronger conditions. 




Theorem 7.6.3 (Taylor's theorem, with Cauchy's remainder) Suppose that $f$ is a continuous function on $[a,b]$ which is $n$-times differentiable on $[a,b)$ (with one-sided derivatives at $a$), and for which the derivatives are bounded on $[a,b)$. Suppose that $k \in \mathbf{R}$ and that $k > 0$. Then there exists $c \in (a,b)$ such that

$$f(b) = p_{n-1}(b) + \frac{(b - c)^{n-k}(b - a)^k}{k(n-1)!}f^{(n)}(c).$$

If we write $c = (1 - \theta_n)a + \theta_n b$, this becomes

$$f(b) = p_{n-1}(b) + \frac{(1 - \theta_n)^{n-k}(b - a)^n}{k(n-1)!}f^{(n)}(c).$$

Proof Suppose that $a < b$; a similar proof holds if $b < a$. Let

$$h(x) = f(x) + (b - x)f'(x) + \cdots + \frac{(b - x)^{n-1}}{(n-1)!}f^{(n-1)}(x)$$

for $a \le x < b$, and let $h(b) = f(b)$. Since the derivatives are bounded on $[a,b)$, the function $h$ is continuous on $[a,b]$ and differentiable on $(a,b)$. The idea behind this definition is that $h(a) = p_{n-1}(b)$ and

$$\frac{d}{dx}\Big(\frac{(b - x)^s}{s!}f^{(s)}(x)\Big) = -\frac{(b - x)^{s-1}}{(s-1)!}f^{(s)}(x) + \frac{(b - x)^s}{s!}f^{(s+1)}(x),$$

so that

$$h'(x) = \frac{(b - x)^{n-1}}{(n-1)!}f^{(n)}(x),$$

all the other terms cancelling in pairs. Let $g(x) = -(b - x)^k$, so that $g(b) - g(a) = (b - a)^k$, and $g'(x) = k(b - x)^{k-1} \neq 0$ for $x \in (a,b)$.

Thus by Cauchy's mean-value theorem there exists $a < c < b$ such that

$$f(b) = h(b) = h(a) + (g(b) - g(a))\frac{h'(c)}{g'(c)} = p_{n-1}(b) + \frac{(b - a)^k (b - c)^{n-1}}{k(b - c)^{k-1}(n-1)!}f^{(n)}(c) = p_{n-1}(b) + \frac{(b - c)^{n-k}(b - a)^k}{k(n-1)!}f^{(n)}(c).$$

Suppose that $f$ is infinitely differentiable. Then we can write $f(x) = p_n(x) + r_{n+1}(x)$ for each $n$. We might hope that $r_n(x) \to 0$ as $n \to \infty$, so that we can write

$$f(x) = f(a) + \sum_{n=1}^{\infty} \frac{(x - a)^n}{n!}f^{(n)}(a),$$

in which case the series on the right-hand side is called the Taylor series for $f$. The following example shows that this is not always the case. Let $f(x) = e^{-1/x^2}$ for $x \neq 0$ and let $f(0) = 0$. Then $f$ is continuous on $\mathbf{R}$. If $x \neq 0$ then $f'(x) = (2/x^3)e^{-1/x^2}$, and $f'(0) = \lim_{x \to 0} e^{-1/x^2}/x = 0$. An inductive argument then shows that there exists a sequence $(s_j)$ of polynomials such that

$$f^{(j)}(x) = \frac{s_j(x)}{x^{3j}}e^{-1/x^2} \quad\text{for } x \neq 0,$$

$$\text{and}\quad f^{(j+1)}(0) = \lim_{x \to 0} \frac{s_j(x)e^{-1/x^2}}{x^{3j+1}} = 0,$$

for all $j \in \mathbf{N}$. Thus $p_n(x) = 0$ for all $n$, and so $r_{n+1}(x) = f(x)$. In this case, the Taylor series gives us no useful information about $f$; the trouble is that $f$ is too smooth at 0.
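The flatness of this function at 0 shows up clearly in a small numerical experiment. The sketch below evaluates $e^{-1/x^2}$ in floating point and divides by high powers of $x$; the name `flat` and the sample points are assumptions of this illustration, and very small inputs are avoided because `x * x` would underflow.

```python
import math


def flat(x):
    """The function equal to exp(-1/x^2) for x != 0 and to 0 at x = 0."""
    return 0.0 if x == 0 else math.exp(-1.0 / (x * x))


# Near 0 the function is smaller than any fixed power of x:
ratio_power_10 = flat(0.1) / 0.1 ** 10     # f(x) / x^10 at x = 0.1
ratio_power_20 = flat(0.05) / 0.05 ** 20   # f(x) / x^20 at x = 0.05

# Every Taylor polynomial at 0 is identically zero, yet the function is not:
taylor_value = 0.0
actual_value = flat(0.5)                   # exp(-4), strictly positive
```

The ratios are tiny even though the denominators are themselves very small, which is the numerical counterpart of all derivatives vanishing at 0.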

Let us give two applications of Taylor’s theorem. Our first application is to 
the Newton—Raphson method of approximation. We consider a continuous 
function f on an interval [a,b] with the following properties: 

(i) f is twice differentiable on (a,b), and f” is bounded on (a,b): there 
exists M such that | f”(x)| < M for all x € (a,b); 

(ii) there exists m > 0 such that f’(x) > m for all x € (a,b); 

(iii) f(a) < 0 and f(b) > 0. 

Then $f$ is strictly increasing on $[a,b]$, and so, by the intermediate value theorem, there exists a unique $c \in (a,b)$ with $f(c) = 0$. The Newton-Raphson method provides a sequence of successive approximations to $c$, and Taylor's theorem tells us that the approximation can improve extremely rapidly. We must start with a reasonably good approximation $x_0$. Let $K = M/2m$. Suppose that we have found $0 < h < 1/K$, $a_1$ and $b_1$ such that

$$a < b_1 - h < a_1 < b_1 < a_1 + h < b, \quad\text{and such that } f(a_1) < 0 < f(b_1).$$

Figure 7.6. The Newton-Raphson method.

Let $\lambda = Kh$, so that $0 < \lambda < 1$ (the smaller $\lambda$ is, the better the approximation will be). Then $c \in (a_1, b_1)$, and

$$[a_1, b_1] \subseteq (c - h, c + h) \subseteq (a, b).$$

Start by choosing $x_0 \in [a_1, b_1]$. Then $K|x_0 - c| < \lambda$. By Taylor's theorem, with Lagrange's remainder, there exists $y_0$ between $c$ and $x_0$ such that

$$0 = f(c) = f(x_0) + (c - x_0)f'(x_0) + (c - x_0)^2 f''(y_0)/2.$$

Hence

$$\frac{f(x_0)}{f'(x_0)} = (x_0 - c) - (c - x_0)^2\frac{f''(y_0)}{2f'(x_0)}.$$

We set $x_1 = x_0 - f(x_0)/f'(x_0)$, so that

$$x_1 - c = (x_0 - c)^2\frac{f''(y_0)}{2f'(x_0)},$$

and so $|x_1 - c| \le K|x_0 - c|^2 < Kh^2 = \lambda h$. Thus $x_1 \in (c - h, c + h)$, and $K|x_1 - c| < \lambda^2$. Iterating the process, we obtain a sequence $(x_n)_{n=0}^{\infty}$ such that $K|x_n - c| < \lambda^{2^n}$. This can lead to very rapid convergence; for example, if $K = 1$ and $\lambda = 1/10$, then $|x_3 - c| < 1/10^8$.

The proof of the existence of the $n$th root of a positive number that was given in Section 3.2 used the Newton-Raphson method.
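The iteration $x_{n+1} = x_n - f(x_n)/f'(x_n)$ is short to program. The sketch below runs it in exact rational arithmetic for the sample function $f(x) = x^2 - 2$ with starting value 2; the function name `newton_raphson` and this choice of starting value are assumptions of the illustration, and no check of the hypotheses (i) to (iii) is performed.

```python
from fractions import Fraction


def newton_raphson(f, df, x0, steps):
    """Return [x_0, x_1, ..., x_steps], where x_{n+1} = x_n - f(x_n) / df(x_n)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs


f = lambda x: x * x - 2      # the positive zero is the square root of 2
df = lambda x: 2 * x
iterates = newton_raphson(f, df, Fraction(2), 3)

# Exact values of f at the iterates: these shrink roughly quadratically.
residuals = [f(x) for x in iterates]
```

Using `Fraction` keeps every iterate an exact rational number, so the rapid decrease of $f(x_n)$ can be read off without rounding error.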
The classic application of Taylor’s theorem is to the binomial theorem. 


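Theorem 7.6.4 (The binomial theorem) Suppose that $\alpha \in \mathbf{R} \setminus \mathbf{N}$ and that $-1 < x < 1$. Let $f_\alpha(x) = (1 + x)^\alpha$. Then

$$f_\alpha(x) = 1 + \alpha x + \frac{\alpha(\alpha - 1)}{2!}x^2 + \cdots = 1 + \sum_{j=1}^{\infty} \binom{\alpha}{j} x^j,$$

the sum converging absolutely.

Proof The proof is not quite straightforward. (It is unfortunate that Professor James Moriarty's treatise is not extant, as it would have thrown light on how the theorem was considered towards the end of the nineteenth century.) The ratio test shows that the series converges absolutely. Further,

$$f_\alpha^{(j)}(x) = \alpha(\alpha - 1)\cdots(\alpha - j + 1)(1 + x)^{\alpha - j}.$$

Thus

$$\frac{f_\alpha^{(j)}(0)}{j!} = \frac{\alpha(\alpha - 1)\cdots(\alpha - j + 1)}{j!} = \binom{\alpha}{j}.$$

We need to show that the remainder $r_n(x)$ tends to 0 as $n \to \infty$. The Lagrange form of the remainder is

$$r_n(x) = \binom{\alpha}{n}(1 + \theta_n x)^{\alpha - n}x^n = (1 + \theta_n x)^\alpha \binom{\alpha}{n}\Big(\frac{x}{1 + \theta_n x}\Big)^n,$$

where $0 < \theta_n < 1$. If $0 < x < 1$ then $\sup_n |x/(1 + \theta_n x)| < 1$, and so $r_n(x) \to 0$ as $n \to \infty$ (see Exercise 3.2.5). If $-1 < x < -1/2$, this argument does not work.

Instead, we use Cauchy's form of the remainder. Choose $k > |\alpha|$. We find that

$$r_n(x) = \frac{\alpha}{k}(1 + \theta_n x)^{\alpha - k}\Big(\frac{1 - \theta_n}{1 + \theta_n x}\Big)^{n-k}\binom{\alpha - 1}{n - 1}x^n.$$

Since $1 - \theta_n < 1 + \theta_n x$, it follows that if $n > k$ then

$$|r_n(x)| \le \frac{|\alpha|}{k}(1 - |x|)^{\alpha - k}\left|\binom{\alpha - 1}{n - 1}\right||x|^n,$$

and so $r_n(x) \to 0$ as $n \to \infty$.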
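The convergence of the binomial series can be observed numerically, including for negative $x$, where the Lagrange form of the remainder was not enough. The sketch below sums the series in floating point; the function name `binomial_series`, the sample exponents and the numbers of terms are assumptions of this illustration.

```python
def binomial_series(alpha, x, terms):
    """Partial sum 1 + sum_{j=1}^{terms} C(alpha, j) x^j for real alpha and |x| < 1."""
    total = 1.0
    term = 1.0                       # holds C(alpha, j) x^j, starting from j = 0
    for j in range(1, terms + 1):
        # C(alpha, j) = C(alpha, j - 1) * (alpha - j + 1) / j
        term *= (alpha - j + 1) / j * x
        total += term
    return total


# A positive x, and a negative x in the range -1 < x < -1/2.
approx_sqrt = binomial_series(0.5, 0.5, 80)        # compare with (1 + 0.5)^0.5
approx_neg = binomial_series(-1.5, -0.75, 400)     # compare with (1 - 0.75)^(-1.5) = 8
```

Convergence slows down as $|x|$ approaches 1, which is why many more terms are used for $x = -0.75$ than for $x = 0.5$.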


This is a remarkably technical proof. Another, easier, proof is given as an exercise in Section 8.7. We shall see in Volume III that these proofs, and the proofs of other consequences of Taylor's theorem, are superseded by the complex version of Taylor's theorem.

Notice that if $\alpha = -\beta < 0$ and $0 < x < 1$ then

$$(1 - x)^{-\beta} = (1 + (-x))^{-\beta} = 1 + \sum_{j=1}^{\infty} \frac{\beta(\beta + 1)\cdots(\beta + j - 1)}{j!}x^j,$$

so that all the summands are positive. In particular,

$$\frac{1}{(1 - x)^{1/2}} = 1 + \frac{1}{2}x + \cdots + \frac{1 \cdot 3 \cdots (2j - 1)}{2 \cdot 4 \cdots (2j)}x^j + \cdots.$$

If $f$ is differentiable at $a$, then $f(x) - f(a) - (x - a)f'(a) = o(|x - a|)$; for this, we do not need to suppose that $f$ is differentiable at any point other than $a$. There is a corresponding result for $n$-times differentiable functions; this is due to W. H. Young. (We shall not use this result later, and it may be omitted.)


Theorem 7.6.5 Suppose that $f$ is $(n - 1)$-times differentiable in an interval $I$ and that $f^{(n-1)}$ is differentiable at an interior point $a$ of $I$. Let

$$p_n(x) = f(a) + f'(a)(x - a) + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n,$$

and let $r_{n+1}(x) = f(x) - p_n(x)$. Then $r_{n+1}(x) = o(|x - a|^n)$.

Proof Let $u(x) = r_{n+1}(x)/(x - a)^n$, for $x \neq a$. Then we must show that $u(x) \to 0$ as $x \to a$. Suppose that $\epsilon > 0$. Let

$$v_\epsilon(x) = r_{n+1}(x) + \epsilon(x - a)^n.$$

Then $v_\epsilon(x) = (u(x) + \epsilon)(x - a)^n$ for $x \neq a$, and $v_\epsilon$ is $n$-times differentiable at $a$;

$$v_\epsilon(a) = 0,\quad v_\epsilon^{(s)}(a) = 0 \text{ for } 1 \le s \le n - 1 \quad\text{and}\quad v_\epsilon^{(n)}(a) = n!\,\epsilon > 0.$$

By Proposition 7.1.5 (v), there exists $\delta > 0$ such that $[a, a + \delta) \subseteq I$ and $v_\epsilon^{(n-1)}(x) > 0$ for $a < x < a + \delta$. By Corollary 7.3.6, $v_\epsilon^{(n-2)}$ is strictly increasing on $[a, a + \delta)$, and so $v_\epsilon^{(n-2)}(x) > 0$ for $a < x < a + \delta$. Iterating the argument, it follows that $v_\epsilon(x) = (u(x) + \epsilon)(x - a)^n > 0$ for $a < x < a + \delta$. Thus $u(x) > -\epsilon$ for $a < x < a + \delta$. Applying the same argument to $w_\epsilon(x) = -r_{n+1}(x) + \epsilon(x - a)^n$, it follows that there exists $\delta' > 0$ such that $[a, a + \delta') \subseteq I$ and $u(x) < \epsilon$ for $a < x < a + \delta'$. Consequently, $u(x) \to 0$ as $x \searrow a$. Similarly $u(x) \to 0$ as $x \nearrow a$.

Exercises

7.6.1 Suppose that $f$ is differentiable in an open interval $I$, and that $f$ is twice differentiable at $a \in I$. Show that

$$\frac{f(a + h) + f(a - h) - 2f(a)}{h^2} \to f''(a) \quad\text{as } h \to 0.$$

[Hint: L'Hôpital's rule.]

7.6.2 Suppose that $f$ is $2k$-times differentiable on an open interval $I$, that $f^{(j)}(a) = 0$ for $1 \le j \le 2k - 1$ and that $f^{(2k)}(a) < 0$. Show that $f$ has a local maximum at $a$.

7.6.3 Suppose that f is twice differentiable on an open interval J, that 
a,b,c € I, with a < b < c. Show that there exists d € (a,b) such 
that 


7.6.4 Apply the Newton-Raphson method to the function $f(x) = x^2 - 2$, starting with $x_0 = 3/2$, to obtain rational approximations to $\sqrt{2}$. How good is the approximation after three iterations?

7.6.5 Let $f(x) = \log(1 + x)$, for $-1 < x < 1$, and let $r_n(x)$ be the $n$th remainder in the Taylor series expansion of $f$. Show that $r_n(x) \to 0$ as $n \to \infty$, and determine the infinite Taylor series for $f$.

7.6.6 Let $f(x) = \tan^{-1}(x)$. Apply the Newton-Raphson method when $0 < |x_0| < 1$, when $x_0 = 1$ and when $|x_0| > 1$. When $(x_n)$ converges, how fast does it converge?

7.6.7 Suppose that $f$ is a convex increasing function on the closed interval $[a,b]$ which is differentiable on the open interval $(a,b)$, and for which $f(a) < 0 < f(b)$. Suppose that $x_0 \in (a,b)$ and that $f(x_0) > 0$. Show that the sequence $(x_n)_{n=0}^{\infty}$ defined by the Newton-Raphson method is decreasing, that $x_n > a$ and that $f(x_n) > 0$. Suppose that $x_n \to c$. Show that $f(c) = 0$, and that there exists $0 < \lambda < 1$ such that $x_n - c \le \lambda^n(x_0 - c)$. What happens if $f(x_0) < 0$?

7.6.8 Apply the Newton-Raphson method to the function $f(x) = x^n$, where $n \ge 2$, starting with $x_0 > 0$. Calculate $x_n$. Why is the convergence slower than that described in the text?

7.6.9 Apply the Newton-Raphson method to the function $f(x) = x + x^{1+\alpha}$, where $x \ge 0$ and $0 < \alpha < 1$, starting with $x_0 > 0$. Calculate $x_n$. Why is the convergence slower than that described in the text?

7.6.10 Suppose that $(a_j)_{j=1}^{\infty}$ is a sequence of positive terms, and that there exists $\alpha > 0$ such that $a_{j+1}/a_j = 1 - \alpha/j + r_j$, where $j r_j \to 0$ as $j \to \infty$. Show that $\sum_{j=1}^{\infty} a_j$ converges if $\alpha > 1$ and that $\sum_{j=1}^{\infty} a_j$ diverges if $\alpha < 1$. (Consider $b_j = 1/j^s$, where $s$ is between 1 and $\alpha$. This extends D'Alembert's ratio test.)


8 


Integration 


8.1 Elementary integrals 


We now turn to integration, which we develop as the ‘area under the curve’. 
We establish the existence and properties of the Riemann integral; this is 
an integral whose development is quite straightforward, and which is good 
for many of the needs of analysis. It has some shortcomings: it can only be 
applied to a restricted class of functions, and it is not easy to obtain good 
results about limits of integrals. For this, a more sophisticated integral, the 
Lebesgue integral, is needed; we shall consider this in Volume III. 

As with all theories of integration, we proceed by approximation. To begin with, we restrict attention to bounded real-valued functions on a finite interval $[a,b]$. The easiest functions to start with are the step functions: functions which take constant values $v_j$ on a finite set $\{I_j : 1 \le j \le k\}$ of disjoint sub-intervals of $[a,b]$. The graph of such a function is a bar graph, and we define the elementary integral of such a function to be $\sum_{j=1}^{k} v_j l(I_j)$, where $l(I_j)$ is the length of the interval $I_j$. Note that $v_j$ can be positive or negative, so that the integral can take positive and negative values.

The idea of the Riemann integral of a function f is to approximate f from 
above and below by step functions. If the integrals of the approximations 
from above and from below approach a common limit, then we take this 
limit to be the Riemann integral of f. 

In order to carry out this programme, we need to set up the appropriate
machinery. A dissection $D$ of $[a,b]$ is a finite subset of $[a,b]$ which
contains both $a$ and $b$. We arrange the elements of $D$, the points of
dissection of $D$, in increasing order: $a = x_0 < x_1 < \cdots < x_k = b$.
The dissection splits $[a,b]$ into $k$ disjoint intervals $I_1,\dots,I_k$. We
need to decide what to do with the endpoints; we adopt the convention that
$I_1 = [x_0,x_1]$ and that $I_j = (x_{j-1},x_j]$ for $2 \le j \le k$.

We order the dissections of $[a,b]$ by inclusion: we say that $D_2$ refines
$D_1$ if $D_1 \subseteq D_2$, and write $D_1 \le D_2$. This is a partial order
on the set $\Delta$ of all dissections of $[a,b]$, and $\Delta$ is a lattice:
$D_1 \vee D_2 = D_1 \cup D_2$ and $D_1 \wedge D_2 = D_1 \cap D_2$. $\Delta$
has a least element $\{a,b\}$, but has no greatest element.

Suppose that $D$ is a dissection, with intervals $I_1,\dots,I_k$. We denote
the indicator function of $I_j$ by $\chi_j$: $\chi_j(x) = 1$ if $x \in I_j$,
and $\chi_j(x) = 0$ otherwise. Similarly, we write $\chi_{[a,b]}$ for the
indicator function of $[a,b]$. We denote the linear span of
$\{\chi_j : 1 \le j \le k\}$ by $E_D$; thus a function $f \in E_D$ is of the
form $\sum_{j=1}^{k} v_j \chi_j$, where $v_1,\dots,v_k$ are real numbers. The
elements of $E_D$ are the step functions on $[a,b]$ whose points of
discontinuity are contained in $D$; note that, according to our convention,
step functions are continuous on the left. $E_D$ is a $k$-dimensional vector
space of functions.

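If $D_2$ refines $D_1$, then $E_{D_1} \subseteq E_{D_2}$, and so the set of
spaces $\{E_D : D \in \Delta\}$ also forms a lattice:

\[ E_{D_1} \wedge E_{D_2} = E_{D_1} \cap E_{D_2} = E_{D_1 \wedge D_2} \]
and
\[ E_{D_1} \vee E_{D_2} = \mathrm{span}\,(E_{D_1} \cup E_{D_2}) = E_{D_1 \vee D_2}. \]

The union $E_\Delta = \bigcup\{E_D : D \in \Delta\}$ is the infinite-dimensional
vector space of all (left-continuous) step functions.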

We now wish to define the elementary integral of a step function $f$. If
$f = \sum_{j=1}^{k} v_j \chi_j$, we want to define $\int_a^b f(x)\,dx$ to be
$\sum_{j=1}^{k} v_j l(I_j)$, where $l(I_j) = x_j - x_{j-1}$ is the length of
$I_j$. But the representation is not unique, and we need to show that the
integral is well-defined.


Proposition 8.1.1 Suppose that $D$ and $D'$ are dissections of $[a,b]$, and
that $f \in E_D \cap E_{D'}$, with representations
$f = \sum_{j=1}^{k} v_j \chi_j$ and $f = \sum_{j=1}^{k'} v'_j \chi'_j$.
Then $\sum_{j=1}^{k} v_j l(I_j) = \sum_{j=1}^{k'} v'_j l(I'_j)$.


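Proof We use the lattice property of $\Delta$. Let $D'' = D \cup D'$. Let
$D = \{x_0,\dots,x_k\}$ and $D'' = \{x''_0,\dots,x''_{k''}\}$. Then there
exist $0 = r_0 < r_1 < \cdots < r_k = k''$ such that $x_j = x''_{r_j}$ for
$0 \le j \le k$. Thus $l(I_j) = \sum_{r=r_{j-1}+1}^{r_j} l(I''_r)$. We can
write $f = \sum_{r=1}^{k''} v''_r \chi''_r$, where $v''_r = v_j$ for
$r_{j-1} < r \le r_j$. Consequently,

\[ \sum_{j=1}^{k} v_j l(I_j)
   = \sum_{j=1}^{k} \Bigl( \sum_{r=r_{j-1}+1}^{r_j} v''_r l(I''_r) \Bigr)
   = \sum_{r=1}^{k''} v''_r l(I''_r). \]

Similarly, $\sum_{j=1}^{k'} v'_j l(I'_j) = \sum_{r=1}^{k''} v''_r l(I''_r)$,
so that

\[ \sum_{j=1}^{k} v_j l(I_j) = \sum_{j=1}^{k'} v'_j l(I'_j). \]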


We can therefore define the elementary integral as

\[ \int_a^b f(x)\,dx = \sum_{j=1}^{k} v_j l(I_j). \]
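The well-definedness argument can be checked on a small case. The following is only an illustrative sketch: the names `elementary_integral` and `refine` are hypothetical helpers, not notation from the text, and the extra dissection points are assumed to lie in $[a,b]$. It represents one step function on a dissection and on a refinement, and compares the two sums $\sum_j v_j l(I_j)$.

```python
from fractions import Fraction


def elementary_integral(points, values):
    """Return sum of v_j * l(I_j) for a step function on a dissection.

    points: increasing dissection points x_0 < ... < x_k
    values: v_1, ..., v_k, the constant value taken on each interval I_j
    """
    if len(values) != len(points) - 1:
        raise ValueError("need exactly one value per interval")
    return sum(v * (hi - lo) for v, lo, hi in zip(values, points, points[1:]))


def refine(points, values, extra):
    """Represent the same step function on the dissection points ∪ extra.

    Assumes every element of `extra` lies in [points[0], points[-1]].
    """
    new_points = sorted(set(points) | set(extra))
    new_values = []
    j = 0
    for hi in new_points[1:]:
        # advance until the new interval ending at hi lies inside (points[j], points[j+1]]
        while hi > points[j + 1]:
            j += 1
        new_values.append(values[j])
    return new_points, new_values


D = [Fraction(0), Fraction(1), Fraction(3)]
v = [Fraction(2), Fraction(-1)]
D2, v2 = refine(D, v, [Fraction(1, 2), Fraction(2)])
print(elementary_integral(D, v), elementary_integral(D2, v2))
```

Exact rational arithmetic is used so that the two values can be compared with `==` rather than up to rounding.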


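Proposition 8.1.2 Suppose that $f$ and $g$ are step functions and that
$c \in \mathbf{R}$. Then $f+g$ and $cf$ are step functions, and
(i) $\int_a^b (f(x)+g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$.
(ii) $\int_a^b cf(x)\,dx = c\int_a^b f(x)\,dx$.
(iii) If $f(x) \le g(x)$ for all $x \in [a,b]$ then $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$.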


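Proof (i) Since $\Delta$ is a lattice, there exists $D \in \Delta$, with
intervals $I_1,\dots,I_k$, such that $f,g \in E_D$. Then we can write
$f = \sum_{j=1}^{k} v_j\chi_j$ and $g = \sum_{j=1}^{k} w_j\chi_j$. Then
$f+g = \sum_{j=1}^{k} (v_j+w_j)\chi_j$ is a step function and

\[ \int_a^b (f(x)+g(x))\,dx = \sum_{j=1}^{k} (v_j+w_j)l(I_j)
   = \sum_{j=1}^{k} v_j l(I_j) + \sum_{j=1}^{k} w_j l(I_j)
   = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx. \]

The proofs of (ii) and (iii) are just as easy, and are left as exercises for
the reader.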


8.2 Upper and lower Riemann integrals 


We now consider a bounded function $f$ on $[a,b]$, with $m \le f(x) \le M$
for all $x \in [a,b]$. We try to integrate it by approximating from above and
below by step functions. Let

\[ U_f = \{g : g \in E_\Delta \text{ and } g \ge f\} \]

be the set of step functions which are greater than or equal to $f$. $U_f$ is
non-empty, since $M\chi_{[a,b]} \in U_f$. If $g \in U_f$, then
$g \ge m\chi_{[a,b]}$, and so $\int_a^b g(x)\,dx \ge m(b-a)$. Thus the set
$\{\int_a^b g(x)\,dx : g \in U_f\}$ is bounded below. We define the upper
Riemann integral of $f$ to be

\[ \overline{\int_a^b} f(x)\,dx = \inf\Bigl\{\int_a^b g(x)\,dx : g \in U_f\Bigr\}. \]

Similarly we set

\[ L_f = \{h : h \in E_\Delta \text{ and } h \le f\} \]

and define the lower Riemann integral of $f$ to be

\[ \underline{\int_a^b} f(x)\,dx = \sup\Bigl\{\int_a^b h(x)\,dx : h \in L_f\Bigr\}. \]

Proposition 8.2.1 Suppose that $f$ is a bounded function on $[a,b]$. Then
$\underline{\int_a^b} f(x)\,dx \le \overline{\int_a^b} f(x)\,dx$.


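Proof If $h \in L_f$ and $g \in U_f$ then $h \le f \le g$, so that

\[ \int_a^b h(x)\,dx \le \int_a^b g(x)\,dx. \]

Taking the supremum over $L_f$, we see that

\[ \underline{\int_a^b} f(x)\,dx \le \int_a^b g(x)\,dx, \]

so that, taking the infimum over $U_f$,

\[ \underline{\int_a^b} f(x)\,dx \le \inf\Bigl\{\int_a^b g(x)\,dx : g \in U_f\Bigr\}
   = \overline{\int_a^b} f(x)\,dx. \]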


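Suppose that $D$ is a dissection, with intervals $I_1,\dots,I_k$, and that
$f$ is a bounded function on $[a,b]$. Let $M(I_j) = \sup\{f(x) : x \in I_j\}$,
and let $M_D(f) = \sum_{j=1}^{k} M(I_j)\chi_j$. Then $M_D(f)$ is the least
element of $E_D \cap U_f = \{g \in E_D, g \ge f\}$. We set

\[ S_D = S_D(f) = \sum_{j=1}^{k} M(I_j) l(I_j) = \int_a^b M_D(f)(x)\,dx. \]

Then $S_D = \inf\{\int_a^b g(x)\,dx : g \in U_f \cap E_D\}$, so that

\[ \overline{\int_a^b} f(x)\,dx
   = \inf\Bigl\{\inf\Bigl\{\int_a^b g(x)\,dx : g \in U_f \cap E_D\Bigr\} : D \in \Delta\Bigr\}
   = \inf\{S_D : D \in \Delta\}. \]

Similarly, we define $m(I_j) = \inf\{f(x) : x \in I_j\}$ and
$m_D(f) = \sum_{j=1}^{k} m(I_j)\chi_j$, and set

\[ s_D = s_D(f) = \sum_{j=1}^{k} m(I_j) l(I_j) = \int_a^b m_D(f)(x)\,dx. \]

Then $s_D = \sup\{\int_a^b g(x)\,dx : g \in L_f \cap E_D\}$, so that

\[ \underline{\int_a^b} f(x)\,dx
   = \sup\Bigl\{\sup\Bigl\{\int_a^b g(x)\,dx : g \in L_f \cap E_D\Bigr\} : D \in \Delta\Bigr\}
   = \sup\{s_D : D \in \Delta\}. \]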


Figure 8.2. Upper and lower sums $S_D$ and $s_D$.


Note that if $D'$ refines $D$ then $S_{D'} \le S_D$ and $s_{D'} \ge s_D$.

In fact, we do not need to consider all the dissections to determine the
upper and lower Riemann integrals. If $D$ is a dissection, with intervals
$I_1,\dots,I_k$, we define the mesh size $\delta(D)$ to be
$\max\{l(I_j) : 1 \le j \le k\}$.


Theorem 8.2.2 Suppose that $(D_r)_{r=1}^{\infty}$ is a sequence of dissections
of $[a,b]$, and that $\delta(D_r) \to 0$ as $r \to \infty$. If $f$ is a bounded
function on $[a,b]$ then $S_{D_r} \to \overline{\int_a^b} f(x)\,dx$ as
$r \to \infty$.


Proof Suppose that $\epsilon > 0$. Then there exists a dissection $D$ of
$[a,b]$, with points of dissection $a = x_0 < x_1 < \cdots < x_k = b$, such
that $S_D < \overline{\int_a^b} f(x)\,dx + \epsilon/2$. The idea of the proof
is to choose $r$ large enough so that the set $D$ is contained in a set of
intervals of $D_r$ of small total length. Let
$\eta = \epsilon/\bigl(2(k+1)(M-m+1)\bigr)$. There exists $r_0$ such that
$\delta(D_r) < \eta$ for $r \ge r_0$. Suppose that $r \ge r_0$. Let
$D' = D \vee D_r$. Then $S_{D'} \le S_D$. Let $\{J_1,\dots,J_q\}$ be the
intervals of the dissection $D_r$, and let $K_1,\dots,K_s$ be the intervals of
the dissection $D'$. We divide $\{1,\dots,q\}$ into two disjoint subsets. Let
$p \in B$ if $J_p$ contains one or more elements of $D$, and let $p \in G$
otherwise. ($B$ is the set of bad indices, and $G$ is the set of good
indices.) Then $|B| \le k+1$. If $p \in B$, then $J_p$ is the disjoint union
$\bigcup_{r \in S_p} K_r$ of finitely many of the intervals of $D'$. Since
$m \le f(x) \le M$,

\[ M l(J_p) \ge M(J_p) l(J_p) \ge \sum_{r \in S_p} M(K_r) l(K_r)
   \ge m \sum_{r \in S_p} l(K_r) = m\,l(J_p). \]

If $p \in G$, then $J_p = K_r$ for some $r \in \{1,\dots,s\}$, so that
$M(J_p) = M(K_r)$. Thus

\[ S_{D_r} - S_{D'} = \sum_{p \in B} \Bigl( M(J_p) l(J_p) - \sum_{r \in S_p} M(K_r) l(K_r) \Bigr)
   \le \sum_{p \in B} (M-m) l(J_p) \le (M-m)(k+1)\delta(D_r) < \epsilon/2. \]

Consequently, if $r \ge r_0$ then

\[ \overline{\int_a^b} f(x)\,dx \le S_{D_r} \le S_{D'} + \epsilon/2 \le S_D + \epsilon/2
   < \overline{\int_a^b} f(x)\,dx + \epsilon, \]

so that $S_{D_r} \to \overline{\int_a^b} f(x)\,dx$ as $r \to \infty$.


We can for example take $D_r$ to be the dissection dividing $[a,b]$ into $r$
intervals of equal length. Alternatively, we can repeatedly bisect the
intervals, so that $D_r$ is a dissection dividing $[a,b]$ into $2^r$ intervals
of equal length; in this case, $(D_r)_{r=1}^{\infty}$ is an increasing sequence
of dissections, so that $(S_{D_r})_{r=1}^{\infty}$ is a decreasing sequence,
converging to $\overline{\int_a^b} f(x)\,dx$ as $r \to \infty$.
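The behaviour of the sums under repeated bisection can be observed numerically. The sketch below makes an assumption that is not part of the text: it takes $f$ to be continuous and increasing, so that the supremum and infimum of $f$ on each interval are its values at the right and left endpoints. The function names are illustrative only.

```python
def upper_sum(f, a, b, n):
    """S_D for a continuous increasing f, D = n equal intervals of [a, b]."""
    h = (b - a) / n
    # sup of f on (x_{j-1}, x_j] is f(x_j) when f is increasing
    return sum(f(a + j * h) for j in range(1, n + 1)) * h


def lower_sum(f, a, b, n):
    """s_D for a continuous increasing f, D = n equal intervals of [a, b]."""
    h = (b - a) / n
    # inf of f on (x_{j-1}, x_j] is f(x_{j-1}) when f is continuous and increasing
    return sum(f(a + j * h) for j in range(0, n)) * h


def square(x):
    return x * x


# dissections of [0, 1] into 2^r equal intervals, r = 1, ..., 10
uppers = [upper_sum(square, 0.0, 1.0, 2 ** r) for r in range(1, 11)]
lowers = [lower_sum(square, 0.0, 1.0, 2 ** r) for r in range(1, 11)]
for r, (lo, up) in enumerate(zip(lowers, uppers), start=1):
    print(r, lo, up)
```

For this choice of $f$ the upper sums decrease and the lower sums increase with $r$, and both approach $1/3$.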


8.3 Riemann integrable functions 


We say that a bounded function $f$ on $[a,b]$ is Riemann integrable if its
upper and lower integrals are equal. The common value is then the Riemann
integral $\int_a^b f(x)\,dx$. In this expression, $f$ is called the integrand.
First, we must check that this extends the elementary integral of step
functions.

Proposition 8.3.1 If $f$ is a step function then it is Riemann integrable,
and the Riemann integral is the same as the elementary integral.

Proof Let $E$ be the elementary integral. Since $f$ is in both $U_f$ and $L_f$,

\[ E \le \underline{\int_a^b} f(x)\,dx \le \overline{\int_a^b} f(x)\,dx \le E, \]

and so all the quantities are equal.


Proposition 8.3.2 Suppose that $f$ is a bounded function on $[a,b]$. Then
$f$ is Riemann integrable if and only if given $\epsilon > 0$ there exist step
functions $g$ and $h$ with $h \le f \le g$ and
$\int_a^b g(x)\,dx - \int_a^b h(x)\,dx < \epsilon$.


Proof This follows immediately from the definition. 


Proposition 8.3.3 Suppose that $f$ is a bounded function on $[a,b]$. Then
$f$ is Riemann integrable if and only if given $\epsilon > 0$ there exists a
dissection $D$ such that $S_D - s_D < \epsilon$.

Proof The condition is clearly sufficient. If $f$ is Riemann integrable and
$\epsilon > 0$ then there exist dissections $D_1$ and $D_2$ such that
$s_{D_1} + \epsilon/2 > \int_a^b f(x)\,dx > S_{D_2} - \epsilon/2$. Let
$D = D_1 \vee D_2$. Then

\[ S_D \le S_{D_2} < s_{D_1} + \epsilon \le s_D + \epsilon. \]


We can express this proposition in terms of the oscillation of $f$. Suppose
that $f$ is a bounded real-valued function on a non-empty set $S$. The
oscillation $\Omega = \Omega(f,S)$ of $f$ on $S$ is defined as

\[ \Omega(f,S) = \sup\{|f(s) - f(t)| : s,t \in S\} = \sup_{s \in S} f(s) - \inf_{s \in S} f(s). \]


Corollary 8.3.4 Suppose that $f$ is a bounded function on $[a,b]$. Then $f$
is Riemann integrable if and only if given $\epsilon > 0$ there exists a
dissection $D = \{a = x_0 < \cdots < x_k = b\}$ of $[a,b]$, with intervals
$I_1,\dots,I_k$, such that

\[ \sum_{j=1}^{k} \Omega(f,I_j)(x_j - x_{j-1}) < \epsilon. \]

Proof For $S_D - s_D = \sum_{j=1}^{k} \Omega(f,I_j)(x_j - x_{j-1})$.


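Corollary 8.3.5 Suppose that $f$ is a bounded function on $[a,b]$. Then $f$
is Riemann integrable if and only if given $\epsilon > 0$ there exist a
dissection $D = \{a = x_0 < \cdots < x_k = b\}$ of $[a,b]$ and a partition
$G \cup B$ of $\{1,\dots,k\}$ such that

\[ \Omega(f,I_j) < \epsilon \text{ for } j \in G \quad\text{and}\quad \sum_{j \in B} l(I_j) < \epsilon, \]

where $I_1,\dots,I_k$ are the intervals of the dissection.

Proof Suppose that the condition is satisfied, and that $\epsilon > 0$. Let
$\eta = \epsilon/(b-a+\Omega(f,[a,b]))$. Applying the condition with $\eta$ in
place of $\epsilon$, we obtain a dissection $D$ and a partition $G \cup B$ for
which

\[ \sum_{j=1}^{k} \Omega(f,I_j)(x_j - x_{j-1})
   = \sum_{j \in G} \Omega(f,I_j)(x_j - x_{j-1}) + \sum_{j \in B} \Omega(f,I_j)(x_j - x_{j-1}) \]
\[ \le \Bigl(\sup_{j \in G} \Omega(f,I_j)\Bigr) \sum_{j \in G} (x_j - x_{j-1})
   + \Omega(f,[a,b]) \sum_{j \in B} (x_j - x_{j-1}) \]
\[ < (b-a)\eta + \Omega(f,[a,b])\eta = \epsilon, \]

so that $f$ is Riemann integrable. Suppose conversely that $f$ is Riemann
integrable. By the previous corollary there exists a dissection $D$ with

\[ \sum_{j=1}^{k} \Omega(f,I_j)(x_j - x_{j-1}) < \min(\epsilon,\epsilon^2). \]

Let $G = \{j : \Omega(f,I_j) < \epsilon\}$ and let
$B = \{j : \Omega(f,I_j) \ge \epsilon\}$. Then

\[ \epsilon \sum_{j \in B} l(I_j) \le \sum_{j \in B} \Omega(f,I_j) l(I_j) < \epsilon^2, \]

which gives the result.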


Many important functions are Riemann integrable. 


Theorem 8.3.6 (i) A continuous function on $[a,b]$ is Riemann integrable.
(ii) A monotonic function on $[a,b]$ is Riemann integrable.


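Proof In both cases, we use Proposition 8.3.3.

(i) We use the fact that $f$ is uniformly continuous. Suppose that
$\epsilon > 0$. There exists $\delta > 0$ such that if $|x - y| < \delta$ then
$|f(x) - f(y)| < \epsilon/(b-a)$. Choose $N$ so that $(b-a)/N < \delta$, and
let $D_N$ be the dissection of $[a,b]$ into $N$ intervals $I_1,\dots,I_N$ of
equal length. Then $l(I_j) = (b-a)/N < \delta$, so that

\[ M_j = \sup\{f(x) : x \in I_j\} \le \inf\{f(x) : x \in I_j\} + \epsilon/(b-a) = m_j + \epsilon/(b-a), \]

for $1 \le j \le N$, and so $S_{D_N} \le s_{D_N} + \epsilon$.

(ii) Without loss of generality we can suppose that $f$ is increasing. Suppose
that $\epsilon > 0$. Choose $N$ so that $N > (f(b) - f(a))(b-a)/\epsilon$. Let
$D_N$ be the dissection of $[a,b]$ into $N$ intervals of equal length, as
before, and let $a = x_0 < x_1 < \cdots < x_N = b$ be the points of dissection.
Then $m_j \ge f(x_{j-1})$ and $M_j = f(x_j)$, so that

\[ S_{D_N} - s_{D_N} \le \sum_{j=1}^{N} (f(x_j) - f(x_{j-1}))\frac{b-a}{N}
   = \frac{(f(b) - f(a))(b-a)}{N} < \epsilon. \]

As an example, consider the function $f(x) = x$ on $[0,a]$, and let $D_N$ be
the dissection of $[0,a]$ into $N$ intervals of equal length. Then

\[ S_{D_N} = \sum_{j=1}^{N} \Bigl(\frac{ja}{N}\Bigr)\Bigl(\frac{a}{N}\Bigr) = \frac{a^2}{2}\Bigl(1 + \frac{1}{N}\Bigr) \]

and

\[ s_{D_N} = \sum_{j=1}^{N} \Bigl(\frac{(j-1)a}{N}\Bigr)\Bigl(\frac{a}{N}\Bigr) = \frac{a^2}{2}\Bigl(1 - \frac{1}{N}\Bigr), \]

from which it follows that $\int_0^a x\,dx = a^2/2$.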
We can also characterize Riemann integrability in terms of a sequence of 
dissections. 


Proposition 8.3.7 Suppose that $(D_r)_{r=1}^{\infty}$ is a sequence of
dissections of $[a,b]$, and that $\delta(D_r) \to 0$ as $r \to \infty$. If $f$
is a bounded function on $[a,b]$, then $f$ is Riemann integrable if and only
if $S_{D_r} - s_{D_r} \to 0$ as $r \to \infty$. If so, then
$\int_a^b f(x)\,dx = \lim_{r \to \infty} S_{D_r} = \lim_{r \to \infty} s_{D_r}$.


Proof This follows immediately from the definition and Theorem 
8.2.2. 


Corollary 8.3.8 Suppose that $f$ is a bounded function on $[a,b]$, and that
$J \in \mathbf{R}$. Then the following are equivalent.

(i) $f$ is Riemann integrable, and $\int_a^b f(x)\,dx = J$.

(ii) If $(D_r)_{r=1}^{\infty}$ is a sequence of dissections of $[a,b]$ with
$\delta(D_r) \to 0$ as $r \to \infty$, if $I_{r,1},\dots,I_{r,q_r}$ are the
intervals of the dissection $D_r$, and if $y_{r,p} \in I_{r,p}$ for
$1 \le p \le q_r$, then

\[ \sum_{p=1}^{q_r} f(y_{r,p}) l(I_{r,p}) \to J \text{ as } r \to \infty. \]

It is important that (ii) must hold for every choice of
$y_{r,p} \in I_{r,p}$, and not just for one particular choice.


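Proof If $f$ is Riemann integrable, and $\int_a^b f(x)\,dx = J$, then

\[ s_{D_r} \le \sum_{p=1}^{q_r} f(y_{r,p}) l(I_{r,p}) \le S_{D_r}, \]

and so (ii) follows from the proposition and the sandwich principle.

Conversely, suppose that (ii) holds. For each $r \in \mathbf{N}$ and each $p$
with $1 \le p \le q_r$ there exist $y_{r,p}$ and $z_{r,p}$ in $I_{r,p}$ for
which $f(y_{r,p}) - f(z_{r,p}) \ge \Omega(f,I_{r,p})/2$. Then

\[ 0 \le \sum_{p=1}^{q_r} \Omega(f,I_{r,p}) l(I_{r,p})
   \le 2 \sum_{p=1}^{q_r} (f(y_{r,p}) - f(z_{r,p})) l(I_{r,p}). \]

But

\[ \sum_{p=1}^{q_r} f(y_{r,p}) l(I_{r,p}) - \sum_{p=1}^{q_r} f(z_{r,p}) l(I_{r,p}) \to J - J = 0 \text{ as } r \to \infty, \]

and so

\[ \sum_{p=1}^{q_r} \Omega(f,I_{r,p}) l(I_{r,p}) \to 0 \text{ as } r \to \infty. \]

Thus $f$ is Riemann integrable, by Corollary 8.3.4. Further, since

\[ s_{D_r} \le \sum_{p=1}^{q_r} f(y_{r,p}) l(I_{r,p}) \le S_{D_r}, \]

it follows from the sandwich principle that

\[ J = \lim_{r \to \infty} \sum_{p=1}^{q_r} f(y_{r,p}) l(I_{r,p}) = \int_a^b f(x)\,dx. \]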
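The point of condition (ii), that the limit does not depend on how the points $y_{r,p}$ are chosen, can be illustrated numerically. This is a sketch under stated assumptions, not part of the proof: the function `riemann_sum`, the tag-choosing functions and the choice $f(x) = x^2$ on $[0,1]$ are all illustrative, and the dissections are taken to have intervals of equal length.

```python
import random


def riemann_sum(f, a, b, n, choose_tag):
    """Sum of f(y_p) * l(I_p) over n equal intervals, with y_p = choose_tag(lo, hi)."""
    h = (b - a) / n
    total = 0.0
    for p in range(n):
        lo, hi = a + p * h, a + (p + 1) * h
        total += f(choose_tag(lo, hi)) * h
    return total


rng = random.Random(0)  # fixed seed so that the run is reproducible

# Three ways of choosing a point of the interval (lo, hi].
tag_rules = {
    "right": lambda lo, hi: hi,
    "midpoint": lambda lo, hi: (lo + hi) / 2,
    # rng.random() lies in [0, 1), so this point lies in (lo, hi]
    "random": lambda lo, hi: lo + (hi - lo) * (1.0 - rng.random()),
}

sums = {
    name: riemann_sum(lambda x: x * x, 0.0, 1.0, 2000, rule)
    for name, rule in tag_rules.items()
}
print(sums)
```

Each sum lies between $s_D$ and $S_D$, which for this increasing function differ by $1/2000$, so all three values are within that distance of $1/3$.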


Let us consider some examples. 
Example 8.3.9 A bounded function which is not Riemann integrable.

Let $f(x) = 1$ if $x$ is rational, and $f(x) = 0$ if $x$ is irrational. If
$g = \sum_{j=1}^{k} v_j\chi_j \in U_f$ then each $I_j$ contains a rational
number, and so $v_j \ge 1$. Thus $\int_a^b g(x)\,dx \ge b-a$, and so
$\overline{\int_a^b} f(x)\,dx \ge b-a$. Since $\chi_{[a,b]} \in U_f$,
$\overline{\int_a^b} f(x)\,dx = b-a$. Similarly, if
$h = \sum_{j=1}^{k} w_j\chi_j \in L_f$ then each $I_j$ contains an irrational
number, and so $w_j \le 0$. Thus $\int_a^b h(x)\,dx \le 0$, and so
$\underline{\int_a^b} f(x)\,dx \le 0$. Since $0 \in L_f$,
$\underline{\int_a^b} f(x)\,dx = 0$. Thus $f$ is not Riemann integrable.

Example 8.3.10 A Riemann integrable function on $[0,1]$ which is
discontinuous at the rational points of $[0,1]$.

If $r \in [0,1]$ is rational, and $r = p/q$ in lowest terms, let
$g(r) = 1/q$, and if $x$ is irrational, let $g(x) = 0$. Then $g$ is
discontinuous at every rational number. Suppose that $\epsilon > 0$. Then
there exists $q_0$ such that $1/q_0 < \epsilon$. Then in a closed interval
$[a,b]$ there are only finitely many rational numbers $r = p/q$ with
$q < q_0$, so that $L = \{x \in [a,b] : g(x) \ge \epsilon\}$ is finite. We can
include $L$ in a finite set of intervals of total length less than
$\epsilon$: there exists a dissection

\[ D = \{0 = x_0 < y_0 < x_1 < y_1 < \cdots < x_k < y_k = 1\} \]

such that

\[ L \subseteq [x_0,y_0] \cup (x_1,y_1] \cup \cdots \cup (x_k,y_k], \]

and $\sum_{i=0}^{k} (y_i - x_i) < \epsilon$. If we take $G$ to consist of the
intervals with right-hand endpoints $x_i$, for $1 \le i \le k$, and $B$ to
consist of the intervals with right-hand endpoints $y_i$, for
$0 \le i \le k$, then $\Omega(g,I_j) < \epsilon$ for $j \in G$, and
$\sum_{j \in B} l(I_j) < \epsilon$, so that $g$ is Riemann integrable, by
Corollary 8.3.5. Further, $S_D \le \epsilon(b-a) + \epsilon$, so that
$\int_a^b g(x)\,dx \le 0$. Since $g$ is non-negative, $\int_a^b g(x)\,dx = 0$.


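Example 8.3.11 A function which is constant on a dense open subset of
$[0,1]$, but which is not Riemann integrable.

Let $C^{(\epsilon)}$ be a fat Cantor set. $C^{(\epsilon)}$ is a perfect subset
of $[0,1]$ with empty interior. Let $I_{C^{(\epsilon)}}$ be the indicator
function of $C^{(\epsilon)}$. Then $I_{C^{(\epsilon)}}$ is zero on the dense
open subset $[0,1] \setminus C^{(\epsilon)}$ of $[0,1]$. Since
$C^{(\epsilon)}$ has an empty interior,
$\underline{\int_0^1} I_{C^{(\epsilon)}}(x)\,dx = 0$. On the other hand, if $D$
is a dissection of $[0,1]$, with intervals $I_1,\dots,I_k$, and if
$G = \{j : I_j \cap C^{(\epsilon)} = \emptyset\}$, then
$\sum_{j \in G} l(I_j) < \epsilon$, and so
$S_D(I_{C^{(\epsilon)}}) > 1 - \epsilon$. Thus
$\overline{\int_0^1} I_{C^{(\epsilon)}}(x)\,dx \ge 1 - \epsilon$.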


Exercises 


8.3.1 Suppose that f is a bounded function on [a,b] which is continu- 
ous except at finitely many points of [a,b]. Show that f is Riemann 
integrable. 

8.3.2 Suppose that $f$ is a Riemann integrable function on $[a,b]$. Suppose
that $\epsilon > 0$. Show that there exist $a < a_1 < b_1 < b$ such that

\[ \sup\{f(x) : x \in [a_1,b_1]\} - \inf\{f(x) : x \in [a_1,b_1]\} < \epsilon. \]

Show that $f$ has a point of continuity in $[a,b]$.

Suppose that $f(x) > 0$ for all $x \in [a,b]$. Show that $\int_a^b f(x)\,dx > 0$.

8.3.3 Suppose that $f$ is a Riemann integrable function on $[a,b]$ and that
$\phi$ is uniformly continuous on $f([a,b])$. Show that $\phi \circ f$ is
Riemann integrable.

8.3.4 Suppose that $f$ is a bounded function on $[a,b]$. Show that

\[ \overline{\int_a^b} f(x)\,dx = \inf\Bigl\{\int_a^b g(x)\,dx : g \text{ continuous}, g \ge f\Bigr\}. \]

8.3.5 Suppose that $f$ is a bounded increasing function on $[a,b]$. Show that

\[ \int_a^b f(x)\,dx = \inf\Bigl\{\int_a^b g(x)\,dx : g \text{ continuous and strictly increasing}, g \ge f\Bigr\}. \]


8.4 Algebraic properties of the Riemann integral 


Here are some straightforward results about upper and lower Riemann integrals.


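Proposition 8.4.1 Suppose that $f$ and $g$ are bounded functions on $[a,b]$,
and that $c \ge 0$.

(i) $\underline{\int_a^b} (f(x)+g(x))\,dx \ge \underline{\int_a^b} f(x)\,dx + \underline{\int_a^b} g(x)\,dx$.

(ii) $\overline{\int_a^b} (f(x)+g(x))\,dx \le \overline{\int_a^b} f(x)\,dx + \overline{\int_a^b} g(x)\,dx$.

(iii) $\underline{\int_a^b} cf(x)\,dx = c\,\underline{\int_a^b} f(x)\,dx$ and $\overline{\int_a^b} cf(x)\,dx = c\,\overline{\int_a^b} f(x)\,dx$.

(iv) $\underline{\int_a^b} (-f(x))\,dx = -\overline{\int_a^b} f(x)\,dx$ and $\overline{\int_a^b} (-f(x))\,dx = -\underline{\int_a^b} f(x)\,dx$.

(v) If $f(x) \le g(x)$ for all $x \in [a,b]$ then

\[ \underline{\int_a^b} f(x)\,dx \le \underline{\int_a^b} g(x)\,dx \quad\text{and}\quad \overline{\int_a^b} f(x)\,dx \le \overline{\int_a^b} g(x)\,dx. \]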


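Proof If $h \in L_f$ and $k \in L_g$ then $h+k \in L_{f+g}$. Thus

\[ \underline{\int_a^b} (f(x)+g(x))\,dx \ge \int_a^b (h(x)+k(x))\,dx = \int_a^b h(x)\,dx + \int_a^b k(x)\,dx. \]

Taking the suprema over $L_f$ and $L_g$, we obtain the first result. The rest
are just as easy.

Corollary 8.4.2 (i) If $f$ and $g$ are Riemann integrable and
$c \in \mathbf{R}$ then $f+g$ and $cf$ are Riemann integrable, and

\[ \int_a^b (f(x)+g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx \]

and

\[ \int_a^b cf(x)\,dx = c\int_a^b f(x)\,dx. \]

(ii) If $f(x) \le g(x)$ for all $x \in [a,b]$ then
$\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$.

Proof (i) We have

\[ \int_a^b f(x)\,dx + \int_a^b g(x)\,dx \le \underline{\int_a^b} (f(x)+g(x))\,dx
   \le \overline{\int_a^b} (f(x)+g(x))\,dx \le \int_a^b f(x)\,dx + \int_a^b g(x)\,dx, \]

and so they are all equal. Scalar multiplication is even easier.
(ii) This follows directly from Proposition 8.4.1 (v).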


When $f$ is continuous, we can say more.


Proposition 8.4.3 Suppose that $f$ is a non-negative continuous function
on $[a,b]$ and that $\int_a^b f(x)\,dx = 0$. Then $f(x) = 0$ for all
$x \in [a,b]$.

Proof Suppose not, and suppose that $f(c) > 0$ for some $c \in [a,b]$. There
exists $\delta > 0$ such that if $x \in (c-\delta,c+\delta) \cap [a,b]$ then
$|f(x) - f(c)| < f(c)/2$. Choose
$\max(a,c-\delta) < x_1 < x_2 < \min(b,c+\delta)$. Then $f(x) > f(c)/2$ for
$x \in (x_1,x_2]$. Let $h(x) = (f(c)/2)\chi_{(x_1,x_2]}$. Then
$f(x) \ge h(x)$ for all $x \in [a,b]$, so that

\[ \int_a^b f(x)\,dx \ge \int_a^b h(x)\,dx = (x_2 - x_1)f(c)/2 > 0, \]

giving a contradiction.

Corollary 8.4.4 Suppose that $f$ and $g$ are continuous functions on $[a,b]$
and that $f(x) \ge g(x)$ for all $x \in [a,b]$. If
$\int_a^b f(x)\,dx = \int_a^b g(x)\,dx$, then $f(x) = g(x)$ for all
$x \in [a,b]$.

Proof The function $h = f - g$ is continuous and non-negative, and
$\int_a^b h(x)\,dx = 0$. Thus $f(x) - g(x) = 0$ for all $x \in [a,b]$.

Recall that if $f$ is a real-valued function on a set $S$ then
$f^+(s) = (f(s))^+ = \max(f(s),0)$, $f^-(s) = (f(s))^- = \max(-f(s),0)$ and
$|f|(s) = |f(s)|$.


Theorem 8.4.5 Suppose that $f$ and $g$ are Riemann integrable functions
on $[a,b]$. Then $f^+$, $f^-$, $|f|$, $f^2$ and $fg$ are all Riemann
integrable.

Proof We use Corollary 8.3.5. Since $\Omega(f^+,I) \le \Omega(f,I)$, it
follows from Corollary 8.3.5 that $f^+$ is Riemann integrable. Similarly,
$f^-$ is Riemann integrable, and so therefore is $|f| = f^+ + f^-$.

Next we consider $f^2$. Let $M = \sup\{|f(x)| : x \in [a,b]\}$. Suppose that
$\epsilon > 0$. Let $\eta = \epsilon/(2M+1)$. By Corollary 8.3.5, there exist
a dissection $D = \{a = x_0 < \cdots < x_k = b\}$ of $[a,b]$ and a partition
$G \cup B$ of $\{1,\dots,k\}$ such that

\[ \Omega(f,I_j) < \eta \text{ for } j \in G \quad\text{and}\quad \sum_{j \in B} l(I_j) < \eta, \]

where $I_1,\dots,I_k$ are the intervals of the dissection. Then
$\sum_{j \in B} l(I_j) < \epsilon$. Since $|s^2 - t^2| = |s+t|.|s-t|$, it
follows that $\Omega(f^2,I_j) \le 2M\eta < \epsilon$ for $j \in G$, and so
$f^2$ is Riemann integrable.

Finally, since $fg = \frac{1}{2}((f+g)^2 - f^2 - g^2)$, $fg$ is Riemann
integrable. [This last trick is called polarization.]


Corollary 8.4.6 If $f$ is Riemann integrable on $[a,b]$ then

\[ \Bigl| \int_a^b f(x)\,dx \Bigr| \le \int_a^b |f(x)|\,dx. \]

Proof For $-|f| \le f \le |f|$.


Exercises 


8.4.1 Give an example of a function $f$ on $[0,1]$ which is not Riemann
integrable, but for which $|f|$ and $f^2$ are Riemann integrable.

8.4.2 Suppose that $f$ and $g$ are Riemann integrable on $[a,b]$. By
considering the function $(f + \lambda g)^2$, for suitable $\lambda$, or
otherwise, establish Schwarz's inequality:

\[ \int_a^b f(x)g(x)\,dx \le \Bigl(\int_a^b f(x)^2\,dx\Bigr)^{1/2} \Bigl(\int_a^b g(x)^2\,dx\Bigr)^{1/2}. \]

8.4.3 Suppose that $f$ and $g$ are Riemann integrable on $[a,b]$, and that
$p$ and $q$ are conjugate indices. Show that $|f|^p$ and $|g|^q$ are Riemann
integrable, and establish Hölder's inequality for integrals:

\[ \Bigl| \int_a^b f(x)g(x)\,dx \Bigr| \le \int_a^b |f(x)g(x)|\,dx
   \le \Bigl(\int_a^b |f(x)|^p\,dx\Bigr)^{1/p} \Bigl(\int_a^b |g(x)|^q\,dx\Bigr)^{1/q}. \]


8.5 The fundamental theorem of calculus 


We have introduced the Riemann integral as the measure of an area under 
a curve. It also acts as the inverse of differentiation. 


Proposition 8.5.1 Suppose that $f$ is a bounded function on $[a,b]$ and that
$a < c < b$. Then $f$ is Riemann integrable on $[a,b]$ if and only if it is
Riemann integrable on $[a,c]$ and $[c,b]$. If so,
$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$.

Proof This is an easy exercise for the reader.

If $a < b$ and $f$ is Riemann integrable on $[a,b]$, we write
$\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx$. Thus the formula above can also be
written as $\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx$.


Theorem 8.5.2 (The fundamental theorem of calculus) (i) Suppose that
$f$ is Riemann integrable on $[a,b]$. Set $F(t) = \int_a^t f(x)\,dx$, for
$a \le t \le b$. $F$ is continuous on $[a,b]$. If $f$ is continuous at $t$
then $F$ is differentiable at $t$, and $F'(t) = f(t)$. (If $t = a$ or $b$,
then $F$ has a one-sided derivative.)

(ii) Suppose that $f$ is differentiable on $[a,b]$ (with one-sided derivatives
at $a$ and $b$). If $f'$ is Riemann integrable then
$f(x) = f(a) + \int_a^x f'(t)\,dt$ for $a \le x \le b$.


Proof (i) The function $f$ is bounded, and so there exists $M$ such that
$|f(x)| \le M$ for all $x \in [a,b]$. Then if $a \le t < s \le b$,

\[ |F(t) - F(s)| = \Bigl| \int_t^s f(x)\,dx \Bigr| \le \int_t^s |f(x)|\,dx \le M(s-t), \]

from which it follows that $F$ is continuous.

Suppose that $f$ is continuous at $t$. Suppose that $\epsilon > 0$. There
exists $\delta > 0$ such that if $|s-t| < \delta$ and $s \in [a,b]$ then
$|f(s) - f(t)| < \epsilon$. Now

\[ \int_t^s (f(x) - f(t))\,dx = F(s) - F(t) - f(t)(s-t), \]

so that if $0 < |s-t| < \delta$ and $s \in [a,b]$ then

\[ |F(s) - F(t) - f(t)(s-t)| = \Bigl| \int_t^s (f(x) - f(t))\,dx \Bigr|
   \le \Bigl| \int_t^s |f(x) - f(t)|\,dx \Bigr| \le \epsilon|s-t|, \]

since $|x-t| \le |s-t| < \delta$ for $x \in [t,s]$. Thus $F$ is differentiable
at $t$, with derivative $f(t)$.

(ii) Let $(D_r)_{r=1}^{\infty}$ be a sequence of dissections of $[a,x]$, with
$\delta(D_r) \to 0$. Suppose that $D_r$ has points of dissection
$a = x_{r,0} < \cdots < x_{r,k_r} = x$. By the mean-value theorem, for each
$1 \le j \le k_r$ there exists $y_{r,j} \in [x_{r,j-1},x_{r,j}]$ such that
$f(x_{r,j}) - f(x_{r,j-1}) = f'(y_{r,j})(x_{r,j} - x_{r,j-1})$. Thus

\[ \sum_{j=1}^{k_r} f'(y_{r,j})(x_{r,j} - x_{r,j-1}) = \sum_{j=1}^{k_r} (f(x_{r,j}) - f(x_{r,j-1})) = f(x) - f(a). \]

The result now follows from Corollary 8.3.8.


Thus if $f$ is a continuous function on $[a,b]$, the integral enables us to
solve the first-order differential equation $F'(x) = f(x)$, with boundary
condition $F(a) = 0$. Any function $F$ which satisfies $F' = f$ is called a
primitive, or anti-derivative, of $f$. If $F$ and $G$ are primitives of $f$,
then $(F-G)'(x) = F'(x) - G'(x) = 0$ on $[a,b]$, and so $F = G + c$, where $c$
is a constant.

It is important to note that in Part (ii) of the theorem, we require f’ to be 
Riemann integrable. In general, the primitive of a function is well-behaved, 
whereas the derivative, where it exists, need not be. The fundamental theo- 
rem of calculus allows us to calculate many integrals without difficulty. Here 
are some examples. 


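1. Suppose that $a < b$, that $k \ne 0$ and that $c > 0$, $c \ne 1$. Then

\[ \int_a^b e^{kx}\,dx = \int_a^b \frac{d}{dt}\Bigl(\frac{e^{kt}}{k}\Bigr)\,dt = \frac{1}{k}(e^{kb} - e^{ka}), \]

\[ \int_a^b c^x\,dx = \int_a^b \frac{d}{dt}\Bigl(\frac{c^t}{\log c}\Bigr)\,dt = \frac{c^b - c^a}{\log c}, \]

\[ \int_a^b \cos x\,dx = \int_a^b \frac{d}{dt}\sin t\,dt = \sin b - \sin a, \]

\[ \int_a^b \sin x\,dx = \int_a^b \frac{d}{dt}(-\cos t)\,dt = \cos a - \cos b, \]

\[ \int_a^b \frac{dx}{1+x^2} = \int_a^b \frac{d}{dt}\tan^{-1} t\,dt = \tan^{-1} b - \tan^{-1} a. \]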
2. If $0 < a < b$ then

\[ \int_a^b \frac{dx}{x} = \int_a^b \frac{d}{dt}\log t\,dt = \log b - \log a. \]


3. If $-1 < a < b < 1$ then

\[ \int_a^b \frac{dx}{\sqrt{1-x^2}} = \int_a^b \frac{d}{dt}\sin^{-1} t\,dt
   = \sin^{-1} b - \sin^{-1} a = \cos^{-1} a - \cos^{-1} b. \]
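Closed forms obtained from the fundamental theorem of calculus can be compared with direct approximation by sums. The following is a minimal sketch: `midpoint_sum` is an illustrative helper (a Riemann sum with the midpoint of each interval as the chosen point), and the intervals and the value $k = 2$ are arbitrary choices made for the example.

```python
import math


def midpoint_sum(f, a, b, n=10000):
    """Riemann sum of f over n equal intervals of [a, b], using interval midpoints."""
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) for j in range(n)) * h


k = 2.0
checks = [
    # (approximation by sums, value given by a primitive)
    (midpoint_sum(lambda x: math.exp(k * x), 0.0, 1.0),
     (math.exp(k * 1.0) - math.exp(k * 0.0)) / k),
    (midpoint_sum(math.cos, 0.0, 2.0),
     math.sin(2.0) - math.sin(0.0)),
    (midpoint_sum(lambda x: 1.0 / (1.0 + x * x), -1.0, 3.0),
     math.atan(3.0) - math.atan(-1.0)),
    (midpoint_sum(lambda x: 1.0 / math.sqrt(1.0 - x * x), -0.5, 0.5),
     math.asin(0.5) - math.asin(-0.5)),
]
for approx, exact in checks:
    print(approx, exact)
```

The agreement is only approximate, since each sum uses a single dissection; it improves as `n` grows.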


We use the fundamental theorem of calculus to obtain the following 
change of variables formula. 


Theorem 8.5.3 Suppose that $g$ is a differentiable increasing function on
$[a,b]$, that $g'$ is Riemann integrable and that $f$ is continuous on
$[g(a),g(b)]$. Then

\[ \int_{g(a)}^{g(b)} f(y)\,dy = \int_a^b f(g(x))g'(x)\,dx. \]

Proof Let $F$ be a primitive for $f$, so that $F$ is continuously
differentiable on $[g(a),g(b)]$. Thus $F \circ g$ is differentiable on
$[a,b]$, and its derivative $F'(g(x))g'(x) = f(g(x))g'(x)$ is Riemann
integrable. Hence

\[ \int_a^b f(g(x))g'(x)\,dx = F(g(b)) - F(g(a)) = \int_{g(a)}^{g(b)} f(y)\,dy. \]


The next result, which is occasionally useful, concerns certain infinite 
Taylor series. 


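Proposition 8.5.4 Suppose that $f$ is an infinitely differentiable function
on $[a,b)$ (with one-sided derivatives at $a$) and that $f^{(n)}(a) \ge 0$ for
all $n \in \mathbf{N}$. Suppose that the Taylor series

\[ \sum_{j=0}^{\infty} \frac{f^{(j+1)}(a)}{j!}(x-a)^j \]

for $f'(x)$ converges to $f'(x)$ for each $x \in (a,b)$. Then

\[ f(x) = f(a) + \sum_{j=1}^{\infty} \frac{f^{(j)}(a)}{j!}(x-a)^j \]

for each $x \in (a,b)$. If $f$ is bounded then
$f(b-) = \lim_{x \nearrow b} f(x)$ exists, and

\[ f(b-) = f(a) + \sum_{j=1}^{\infty} \frac{f^{(j)}(a)}{j!}(b-a)^j. \]

Proof Let $s_n(x) = \sum_{j=0}^{n} f^{(j+1)}(a)(x-a)^j/j!$ and let
$u_n(x) = f(a) + \sum_{j=1}^{n} f^{(j)}(a)(x-a)^j/j!$. If $a < t < x$ then

\[ 0 \le f'(t) - s_n(t) = \sum_{j=n+1}^{\infty} \frac{f^{(j+1)}(a)}{j!}(t-a)^j \le f'(x) - s_n(x), \]

so that

\[ 0 \le f(x) - u_{n+1}(x) = \int_a^x (f'(t) - s_n(t))\,dt \le (x-a)(f'(x) - s_n(x)). \]

Since $s_n(x) \to f'(x)$ as $n \to \infty$, it follows that
$u_n(x) \to f(x)$ as $n \to \infty$.

Since $f'(x) \ge 0$ for $x \in [a,b)$, $f$ is an increasing function, so that
if $f$ is bounded then $f(b-) = \lim_{x \nearrow b} f(x)$ exists. The sequence
$(u_n(b))_{n=1}^{\infty}$ is increasing. Since each of the polynomials $u_n$
is continuous, it follows that $f(b-) \ge u_n(b)$, for $n \in \mathbf{N}$. But

\[ f(b-) = \sup_{a<x<b} f(x) = \sup_{a<x<b} \sup_{n \in \mathbf{N}} u_n(x) \le \sup_{n \in \mathbf{N}} u_n(b), \]

and so $u_n(b) \to f(b-)$ as $n \to \infty$.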


Here is a particular application.

Corollary 8.5.5 If $0 \le x \le 1$ then

\[ \sin^{-1} x = \sum_{j=0}^{\infty} \binom{2j}{j} \frac{x^{2j+1}}{2^{2j}(2j+1)}. \]

Proof For

\[ \frac{d}{dx}\sin^{-1} x = (1-x^2)^{-1/2} = \sum_{j=0}^{\infty} \binom{2j}{j} \frac{x^{2j}}{2^{2j}}. \]

Exercises

8.5.1 Suppose that $f$ is a Riemann integrable function on $[a,b]$. Show that
the set of points of continuity of $f$ is dense in $[a,b]$.

8.5.2 It is not easy to give conditions on a differentiable function $f$ to
ensure that $f'$ is Riemann integrable. One sufficient condition is that $f'$
is continuous. Use part (i) of the fundamental theorem of calculus to
prove part (ii), in the case where $f'$ is a continuous function on $[a,b]$.

8.5.3 Let

\[ L(y) = \int_1^y \frac{dt}{t}, \quad\text{for } 0 < y < \infty. \]

Show that $L$ is a continuous strictly increasing mapping of $(0,\infty)$ onto
$\mathbf{R}$. Let $E$ be the inverse function. Taking these as the definitions
of the logarithmic and exponential functions, establish the basic results
(i)-(x) of Section 7.4.

8.5.4 Suppose that $f$ is a continuous convex function on $[a,b]$. Show that

\[ f(b) - f(a) = \int_a^b f'(x+)\,dx = \int_a^b f'(x-)\,dx. \]

8.5.5 Suppose that $f$ is a continuous periodic function on $\mathbf{R}$, with
period 1. Suppose that $a > 0$. Show that
$\int_0^1 (f(x+a) - f(x))\,dx = 0$. Deduce that there exist
$0 \le x_1 < x_2 \le 1$ such that $f(x_1 + a) = f(x_1)$ and
$f(x_2 + a) = f(x_2)$.

8.5.6 Show that if $-1 < x < 1$ then

\[ \tan^{-1}(x) = \int_0^x \frac{dt}{1+t^2}
   = \sum_{j=0}^{n-1} (-1)^j \frac{x^{2j+1}}{2j+1} + \int_0^x \frac{(-t^2)^n}{1+t^2}\,dt, \]

and show that

\[ \tan^{-1}(x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n+1}. \]

Why is this the Taylor series expansion of $\tan^{-1}(x)$?

8.5.7 Show that $\tan(2x) = 2\tan(x)/(1 - \tan^2(x))$. Deduce that

\[ \pi/4 = 4\tan^{-1}(1/5) - \tan^{-1}(1/239). \]

Use this to calculate $\pi$ to five decimal places.

[$4/239 = 0.01673640\ldots$]


8.6 Some mean-value theorems 


Here is an easy mean-value theorem. 


Theorem 8.6.1 Suppose that $f$ is a continuous function on $[a,b]$. Then
there exists $a < c < b$ such that $\int_a^b f(x)\,dx = (b-a)f(c)$.

Proof Let $F(t) = \int_a^t f(x)\,dx$, for $a \le t \le b$. Then
$\int_a^b f(x)\,dx = F(b) - F(a)$ and $F'(t) = f(t)$, and so the result follows
from the mean-value theorem.


The proof of the next theorem is considerably harder; it is similar to the 
proof of Dirichlet’s test. 


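Theorem 8.6.2 (Bonnet's mean-value theorem) Suppose that $f$ is a
Riemann integrable function on $[a,b]$ and that $\phi$ is a decreasing
non-negative function on $[a,b]$. Then there exists $a \le c \le b$ such that

\[ \int_a^b \phi(x)f(x)\,dx = \phi(a)\int_a^c f(x)\,dx. \]

Proof Let $F(c) = \int_a^c f(x)\,dx$ for $a \le c \le b$, and let
$\Lambda = \sup\{F(t) : t \in [a,b]\}$, $\lambda = \inf\{F(t) : t \in [a,b]\}$.

Since $F$ is a continuous function on $[a,b]$, it is sufficient, by the
intermediate value theorem, to show that

\[ \phi(a)\lambda \le \int_a^b \phi(x)f(x)\,dx \le \phi(a)\Lambda. \]

Suppose that $\epsilon > 0$. There exists a dissection $D$, with points of
dissection $a = x_0 < \cdots < x_k = b$, such that
$S_D(f) < s_D(f) + \epsilon$ and

\[ \int_a^b \phi(x)f(x)\,dx - \epsilon < \sum_{j=1}^{k} \phi(x_{j-1})f(x_{j-1})(x_j - x_{j-1}) < \int_a^b \phi(x)f(x)\,dx + \epsilon. \]

Let $a_j = f(x_{j-1})(x_j - x_{j-1})$, and let $b_j = a_1 + \cdots + a_j$, for
$1 \le j \le k$; let $b_0 = 0$. Then

\[ \sum_{j=1}^{k} \phi(x_{j-1})f(x_{j-1})(x_j - x_{j-1}) = \sum_{j=1}^{k} \phi(x_{j-1})a_j = \sum_{j=1}^{k} \phi(x_{j-1})(b_j - b_{j-1}) \]
\[ = \sum_{j=1}^{k-1} (\phi(x_{j-1}) - \phi(x_j))b_j + \phi(x_{k-1})b_k. \]

Now

\[ b_j \le \sum_{i=1}^{j} M_i(f)(x_i - x_{i-1}) \le \sum_{i=1}^{j} m_i(f)(x_i - x_{i-1}) + \epsilon \le F(x_j) + \epsilon \le \Lambda + \epsilon. \]

Similarly, $b_j \ge \lambda - \epsilon$. Thus

\[ \phi(a)(\lambda - \epsilon) = \sum_{j=1}^{k-1} (\phi(x_{j-1}) - \phi(x_j))(\lambda - \epsilon) + \phi(x_{k-1})(\lambda - \epsilon) \]
\[ \le \sum_{j=1}^{k-1} (\phi(x_{j-1}) - \phi(x_j))b_j + \phi(x_{k-1})b_k \]
\[ = \sum_{j=1}^{k} \phi(x_{j-1})f(x_{j-1})(x_j - x_{j-1}) < \int_a^b \phi(x)f(x)\,dx + \epsilon. \]

A similar argument shows that

\[ \phi(a)(\Lambda + \epsilon) > \int_a^b \phi(x)f(x)\,dx - \epsilon, \]

and so

\[ \phi(a)(\lambda - \epsilon) - \epsilon \le \int_a^b \phi(x)f(x)\,dx \le \phi(a)(\Lambda + \epsilon) + \epsilon. \]

Since this holds for all $\epsilon > 0$, the result follows.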


There is also a version for increasing functions. 




Corollary 8.6.3 Suppose that $f$ is a Riemann integrable function on $[a,b]$
and that $\phi$ is an increasing non-negative function on $[a,b]$. Then there
exists $a \le c \le b$ such that

\[ \int_a^b \phi(x)f(x)\,dx = \phi(b)\int_c^b f(x)\,dx. \]

Proof Consider the function $g$ on $[-b,-a]$ defined by $g(x) = f(-x)$.


We can also drop the non-negativity condition. 


Corollary 8.6.4 (Du Bois-Reymond's mean-value theorem) Suppose that
$f$ is a Riemann integrable function on $[a,b]$ and that $\psi$ is a monotonic
function on $[a,b]$. Then there exists $a \le c \le b$ such that

\[ \int_a^b \psi(x)f(x)\,dx = \psi(a)\int_a^c f(x)\,dx + \psi(b)\int_c^b f(x)\,dx. \]

Proof If $\psi$ is decreasing, set $\phi(x) = \psi(x) - \psi(b)$. If $\psi$
is increasing, set $\phi(x) = -\psi(x) + \psi(b)$.


Exercises 


8.6.1 Suppose that 0 < a < b. Show that 
b_: 
ih sin u au Bes 
es art 
8.6.2 Suppose that $0 < a < b$ and that $k > 0$. Show that


b ‘* 
| sin ku au se 
e U 


8.6.3 Suppose that 0<s<t< 7/2 and that K > 0. Show that 


ff ae 


21 sin u 


8.6.4 Suppose that $\phi$ is a non-negative increasing function on $(0,t]$,
where $0 < t < \pi/2$, that $\phi(u) \to 0$ as $u \searrow 0$, and that
$K > 0$. Show that


2 f Hedsnd a <o, 


sin U 
8.7 Integration by parts


Let us apply the fundamental theorem of calculus to the product of two 
functions. 


Theorem 8.7.1 (Integration by parts) Suppose that $f$ is continuous on
$[a,b]$, that $g$ is continuous and differentiable on $[a,b]$, and that $g'$
is Riemann integrable. Let $F$ be a primitive for $f$. Then

\[ \int_a^b f(x)g(x)\,dx = F(b)g(b) - F(a)g(a) - \int_a^b F(x)g'(x)\,dx. \]

Proof Since $F' = f$, the function $F(x)g(x)$ has derivative
$f(x)g(x) + F(x)g'(x)$, which is Riemann integrable. Applying the fundamental
theorem of calculus,

\[ F(b)g(b) = F(a)g(a) + \int_a^b (F(x)g(x))'\,dx
   = F(a)g(a) + \int_a^b f(x)g(x)\,dx + \int_a^b F(x)g'(x)\,dx, \]

from which the result follows.


The difference $F(b)g(b) - F(a)g(a)$ is frequently written as
$[F(x)g(x)]_a^b$. As an example, let us calculate $\int_0^a x\sin x\,dx$,
where $a > 0$. Set $f(x) = \sin x$ and $g(x) = x$. Then we can take
$F(x) = -\cos x$, and so

\[ \int_0^a x\sin x\,dx = (-\cos a).a - (-1).0 + \int_0^a \cos x\,dx = \sin a - a\cos a. \]
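The worked example can be checked numerically by evaluating both sides of the integration by parts formula. This is a sketch only: `midpoint_sum` and `by_parts_rhs` are illustrative names, the integrals are approximated by sums over a single fine dissection, and the endpoint $a = 2$ is an arbitrary choice.

```python
import math


def midpoint_sum(f, a, b, n=20000):
    """Riemann sum of f over n equal intervals of [a, b], using interval midpoints."""
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) for j in range(n)) * h


def by_parts_rhs(F, g, dg, a, b):
    """F(b)g(b) - F(a)g(a) - (integral of F g' over [a, b]), with dg playing the role of g'."""
    return F(b) * g(b) - F(a) * g(a) - midpoint_sum(lambda x: F(x) * dg(x), a, b)


a_end = 2.0
# left-hand side: the integral of f(x) g(x) = x sin x over [0, a_end]
lhs = midpoint_sum(lambda x: x * math.sin(x), 0.0, a_end)
# right-hand side, with F(x) = -cos x, g(x) = x and g'(x) = 1
rhs = by_parts_rhs(lambda x: -math.cos(x), lambda x: x, lambda x: 1.0, 0.0, a_end)
closed_form = math.sin(a_end) - a_end * math.cos(a_end)
print(lhs, rhs, closed_form)
```

The three printed values agree up to the error of the approximating sums.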


Although Theorem 8.7.1 is very easy, it is also extremely powerful. It 
provides a continuous analogue of the argument used to establish Dirichlet’s 
test. To illustrate this, let us use the integration by parts formula to prove a 
version of Bonnet’s mean-value theorem (where rather stronger conditions 
are imposed). 

Theorem 8.7.2 Suppose that $f$ is a Riemann integrable function on $[a,b]$
and that $\phi$ is a decreasing, continuous, differentiable, non-negative
function on $[a,b]$, and that $\phi'$ is Riemann integrable. Then there exists
$a \le c \le b$ such that

\[ \int_a^b \phi(x)f(x)\,dx = \phi(a)\int_a^c f(x)\,dx. \]

Proof Let $F(c) = \int_a^c f(x)\,dx$ for $a \le c \le b$, and let
$\Lambda = \sup\{F(t) : t \in [a,b]\}$, $\lambda = \inf\{F(t) : t \in [a,b]\}$.
As in Theorem 8.6.2, since $F$ is a continuous function on $[a,b]$, it is
sufficient, by the intermediate value theorem, to show that

\[ \phi(a)\lambda \le \int_a^b f(x)\phi(x)\,dx \le \phi(a)\Lambda. \]

Integrating by parts,

\[ \int_a^b f(x)\phi(x)\,dx = F(b)\phi(b) - \int_a^b F(x)\phi'(x)\,dx. \]

Thus, since $\phi' \le 0$,

\[ \lambda\phi(a) = \lambda\phi(b) - \lambda\int_a^b \phi'(x)\,dx \le F(b)\phi(b) - \int_a^b F(x)\phi'(x)\,dx \]
\[ \le \Lambda\phi(b) - \Lambda\int_a^b \phi'(x)\,dx = \Lambda\phi(a). \]

Integration by parts enables us to give another version of Taylor’s theorem 
with remainder. 


Theorem 8.7.3 (Taylor's theorem with integral remainder) Suppose that
$f$ is $n$ times continuously differentiable on $[a,b]$. Then

\[ f(b) = f(a) + \sum_{j=1}^{n-1} \frac{(b-a)^j}{j!} f^{(j)}(a) + \frac{1}{(n-1)!}\int_a^b (b-x)^{n-1} f^{(n)}(x)\,dx. \]

Proof By induction on $n$. It is true for $n = 1$, by the fundamental
theorem of calculus. Suppose that it is true for $n$, and that $f$ is
$(n+1)$-times continuously differentiable. Then it is $n$ times continuously
differentiable, and so by the inductive hypothesis

\[ f(b) = f(a) + \sum_{j=1}^{n-1} \frac{(b-a)^j}{j!} f^{(j)}(a) + \frac{1}{(n-1)!}\int_a^b (b-x)^{n-1} f^{(n)}(x)\,dx. \]

Now $-(b-x)^n/n$ is a primitive for $(b-x)^{n-1}$, and so, integrating by parts,

\[ \int_a^b (b-x)^{n-1} f^{(n)}(x)\,dx = \frac{1}{n}(b-a)^n f^{(n)}(a) + \frac{1}{n}\int_a^b (b-x)^n f^{(n+1)}(x)\,dx. \]

Thus

\[ \frac{1}{(n-1)!}\int_a^b (b-x)^{n-1} f^{(n)}(x)\,dx = \frac{1}{n!}(b-a)^n f^{(n)}(a) + \frac{1}{n!}\int_a^b (b-x)^n f^{(n+1)}(x)\,dx, \]

and so

\[ f(b) = f(a) + \sum_{j=1}^{n} \frac{(b-a)^j}{j!} f^{(j)}(a) + \frac{1}{n!}\int_a^b (b-x)^n f^{(n+1)}(x)\,dx. \]

This form of Taylor's theorem differs from the earlier ones, in that it gives
the remainder $r_n(b)$ explicitly as a function of $b$.
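Taylor's formula with integral remainder can be tested numerically in a case where every derivative is known. The sketch below assumes $f = \exp$, so that $f^{(j)} = \exp$ for all $j$; the function name and the approximation of the remainder integral by a midpoint sum are illustrative choices, not part of the text.

```python
import math


def taylor_rhs_for_exp(a, b, n, steps=20000):
    """Taylor polynomial of exp at a up to degree n-1, plus the integral remainder.

    Uses f = exp, all of whose derivatives are exp. The remainder integral is
    approximated by a midpoint sum over `steps` equal intervals of [a, b].
    """
    # f(a) + sum_{j=1}^{n-1} (b-a)^j / j! * f^(j)(a), written as a sum from j = 0
    poly = sum((b - a) ** j / math.factorial(j) * math.exp(a) for j in range(n))
    h = (b - a) / steps
    integral = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        integral += (b - x) ** (n - 1) * math.exp(x)  # (b-x)^(n-1) f^(n)(x)
    integral *= h
    return poly + integral / math.factorial(n - 1)


for n in (1, 2, 3, 5):
    print(n, taylor_rhs_for_exp(0.0, 1.0, n), math.exp(1.0))
```

For each $n$ the right-hand side reproduces $e$ up to the error of the approximating sum, although the polynomial part and the remainder separately change with $n$.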


Exercises 


8.7.1 Show that

\[ \int_0^{\pi/2} x^n \sin x\,dx = n\Bigl(\frac{\pi}{2}\Bigr)^{n-1} - n(n-1)\int_0^{\pi/2} x^{n-2}\sin x\,dx \quad\text{for } n \ge 2. \]

8.7.2 Suppose that $f$ is continuous and differentiable on $[a,b]$, and that
$f'$ is Riemann integrable. Show that

\[ \int_a^b f(x)\,dx = (b-a)f(b) - \int_a^b (x-a)f'(x)\,dx. \]

Suppose that $m,n \in \mathbf{N}$ and that $a \le m < n \le b$. Establish
Euler's summation formula:

\[ \sum_{j=m+1}^{n} f(j) = \int_m^n f(x)\,dx + \int_m^n (x - [x])f'(x)\,dx. \]

(Here $[x]$ is the integral part of $x$; the greatest integer not greater
than $x$.)

8.7.3 Use Taylor's theorem with integral remainder to give another proof of
the binomial theorem.


8.8 Improper integrals and singular integrals 


So far, we have considered integrals of bounded functions defined on a bounded closed interval. How can we deal with unbounded functions, or different sorts of interval? There are various limiting processes that we can use; the resulting integrals are called improper integrals and singular integrals.

First we consider a function $f$ which is defined on a semi-infinite interval $[a,\infty)$ and whose restriction to each finite subinterval $[a,b]$ is Riemann integrable. If $\int_a^b f(x)\,dx$ tends to a limit $l$ as $b \to \infty$, we set $l = \int_a^\infty f(x)\,dx$; this is an improper integral. It may also happen that $\int_a^b f(x)\,dx \to \infty$ as $b \to \infty$; in this case we write $\int_a^\infty f(x)\,dx = \infty$. For example, if $f$ is non-negative, then the function $I(b) = \int_a^b f(x)\,dx$ is an increasing function on $[a,\infty)$, and so either $I(b)$ tends to a finite limit as $b \to \infty$, which is the integral $\int_a^\infty f(x)\,dx$, or $I(b) \to \infty$ as $b \to \infty$, in which case $\int_a^\infty f(x)\,dx = \infty$.

Let us give some examples. 

First,

\[ \int_0^\infty (1+x^2)^{-1}\,dx = \lim_{b\to\infty}\int_0^b (1+x^2)^{-1}\,dx = \lim_{b\to\infty}\tan^{-1}(b) = \pi/2. \]
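Secondly, the function $\operatorname{sinc} x = (\sin x)/x$ is an important function in the theory of signal processing. What can we say about $\int_0^\infty \operatorname{sinc} x\,dx$? There is no problem at 0, since $\operatorname{sinc} x \to 1$ as $x \to 0$. Let

\[ I_n = \left|\int_{(n-1)\pi}^{n\pi} \operatorname{sinc} x\,dx\right| = \int_{(n-1)\pi}^{n\pi} \frac{|\sin x|}{x}\,dx. \]

Then $I_1 > I_2 > \cdots$, and $I_n \to 0$ as $n \to \infty$. By the alternating series test, the limit

\[ \lim_{n\to\infty}\int_0^{n\pi} \operatorname{sinc} x\,dx = \lim_{n\to\infty}\sum_{j=1}^{n} (-1)^{j+1} I_j \]

exists. Further $\int_{n\pi}^{b} \operatorname{sinc} x\,dx \to 0$ as $b \to \infty$, where $n\pi$ is the largest multiple of $\pi$ not exceeding $b$, so that

\[ \int_0^\infty \operatorname{sinc} x\,dx = \lim_{b\to\infty}\int_0^b \operatorname{sinc} x\,dx \]

exists. But note that

\[ I_n \ge \int_{(n-2/3)\pi}^{(n-1/3)\pi} |\operatorname{sinc} x|\,dx \ge \frac{1}{6n}, \]

so that $\int_0^\infty |\operatorname{sinc} x|\,dx = \infty$.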
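The contrast between the two integrals can be seen numerically. The sketch below is only illustrative: the helpers `simpson` and `arch` are hypothetical names, each arch $[(n-1)\pi, n\pi]$ is integrated by Simpson's rule, and the comparison value $\pi/2$ for the signed integral is the one established in Exercise 8.8.6.

```python
import math

def sinc(x):
    return 1.0 if x == 0.0 else math.sin(x) / x

def simpson(g, a, b, m):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = g(a) + g(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def arch(n, absolute=False):
    """Integral of sinc (or of |sinc|) over the arch [(n-1)pi, n pi]."""
    g = (lambda x: abs(sinc(x))) if absolute else sinc
    return simpson(g, (n - 1) * math.pi, n * math.pi, 200)

N = 2000
signed = sum(arch(n) for n in range(1, N + 1))                    # integral over [0, N pi]
unsigned = sum(arch(n, absolute=True) for n in range(1, N + 1))   # same range, with |sinc|
print(signed, math.pi / 2)   # the signed integral settles down near pi/2
print(unsigned)              # the absolute integral keeps growing as N increases
```

Increasing `N` changes `signed` very little, while `unsigned` grows roughly like a multiple of $\log N$, in line with the lower bound $I_n \ge 1/6n$.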

This last phenomenon shows that we must proceed with some care. For example, let $f(x) = (\sin x)/\sqrt{x}$. Then arguments just like those for sinc show that the improper integral $\int_0^\infty f(x)\,dx$ exists. But

\[ \lim_{b\to\infty}\int_0^b f^2(x)\,dx = \lim_{b\to\infty}\int_0^b \frac{\sin^2 x}{x}\,dx = \infty, \]

since

\[ \int_{(n-1)\pi}^{n\pi} \frac{\sin^2 x}{x}\,dx \ge \frac{1}{n\pi}\int_{(n-1)\pi}^{n\pi} \sin^2 x\,dx = \frac{1}{2n} \]

and $\sum_{n=1}^\infty (1/2n) = \infty$.


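Theorem 8.8.1 (The integral test) Suppose that $f$ is a decreasing non-negative function on $[0,\infty)$ and that $f(x) \to 0$ as $x \to \infty$.

(i) The series $\sum_{j=0}^\infty f(j)$ converges if and only if $\lim_{n\to\infty}\int_0^n f(x)\,dx$ exists, and then

\[ \sum_{j=1}^\infty f(j) \le \int_0^\infty f(x)\,dx \le \sum_{j=0}^\infty f(j). \]

(ii) If

\[ C_n = \sum_{j=0}^{n-1} f(j) - \int_0^n f(x)\,dx \quad\text{and}\quad D_n = \sum_{j=0}^{n} f(j) - \int_0^n f(x)\,dx, \]

so that $C_n \le D_n$, then $(C_n)_{n=1}^\infty$ is an increasing sequence and $(D_n)_{n=1}^\infty$ is a decreasing sequence, and the two sequences converge to a common limit $G$.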


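Proof (i) Let us set $g(x) = f(\lfloor x\rfloor)$ and $h(x) = f(\lfloor x\rfloor + 1)$, so that $h \le f \le g$. Now

\[ \int_0^n g(x)\,dx = \sum_{j=0}^{n-1} f(j) \quad\text{and}\quad \int_0^n h(x)\,dx = \sum_{j=1}^{n} f(j), \]

and so

\[ \sum_{j=1}^{n} f(j) \le \int_0^n f(x)\,dx \le \sum_{j=0}^{n-1} f(j). \]

Thus either the sum and the integral both converge, or they both diverge.

(ii) Since

\[ C_{n+1} - C_n = f(n) - \int_n^{n+1} f(x)\,dx \ge 0, \]

$(C_n)_{n=1}^\infty$ is an increasing sequence; similarly

\[ D_{n+1} - D_n = f(n+1) - \int_n^{n+1} f(x)\,dx \le 0, \]

so that $(D_n)_{n=1}^\infty$ is a decreasing sequence. Finally, $D_n - C_n = f(n) \to 0$, so that $C_n$ and $D_n$ converge to a common limit $G$.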




If we set $f(x) = 1/(1+x)$, we see that $\sum_{j=1}^{n}(1/j) - \log(n+1)$ increases to a constant $\gamma$ and that $\sum_{j=1}^{n}(1/j) - \log n$ decreases to $\gamma$, where $0 < \gamma < 1$. The number $\gamma$ is called Euler’s constant; its value is $0.577\cdots$. It is not known if $\gamma$ is rational, or if it is algebraic; but every instinct suggests that it is transcendental. It is sometimes called Mascheroni’s constant, since in 1790 Mascheroni calculated it to 32 decimal places, although in fact only the first 19 were correct; in 1878, J.C. Adams calculated it to 260 decimal places. With the use of computers, $\gamma$ is now known to $10^{10}$ decimal places.
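The two monotone sequences of Theorem 8.8.1(ii) give a simple, if slow, way to bracket Euler's constant. The sketch below assumes the choice $f(x) = 1/(1+x)$, for which $\int_0^n f(x)\,dx = \log(n+1)$; the function name `bracket_gamma` is a hypothetical helper.

```python
import math

def bracket_gamma(n):
    """Return (C_n, D_n) of Theorem 8.8.1(ii) for f(x) = 1/(1+x)."""
    integral = math.log(n + 1)                               # integral of f over [0, n]
    c_n = sum(1.0 / (1 + j) for j in range(n)) - integral    # sum over j = 0, ..., n-1
    d_n = c_n + 1.0 / (1 + n)                                # D_n - C_n = f(n)
    return c_n, d_n

for n in (10, 100, 1000):
    print(n, bracket_gamma(n))   # lower and upper bounds closing in on 0.577...
```

Since $D_n - C_n = 1/(n+1)$, each extra decimal place costs about ten times as many terms, which is why serious computations of the constant use other methods.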

As another example, let us consider $a_n = 1/(n\log n)$, for $n \ge 2$. Consider $f(x) = 1/(x\log x)$, for $x \ge e$. Then

\[ \int_e^n \frac{dx}{x\log x} = \log(\log n) \to \infty \]

as $n \to \infty$, and so $\sum_{n=2}^\infty 1/(n\log n)$ diverges. [Of course no harm is done by starting at 2, and at $e$.]

As a second example of an improper integral, let us consider a function $f$ which is defined on $\mathbf{R}$ and whose restriction to each finite subinterval $[a,b]$ is Riemann integrable. There are then two ways of proceeding. First we may require that the two improper integrals $\int_0^\infty f(x)\,dx$ and $\int_{-\infty}^0 f(x)\,dx$ (defined in the obvious way) both exist, and then define $\int_{-\infty}^\infty f(x)\,dx$ to be their sum. In this case, we again call the resulting value the improper integral. Alternatively, we may simply require that $\lim_{R\to\infty}\int_{-R}^{R} f(x)\,dx$ exists. In this case, the limit is called the Cauchy principal value of the integral or the singular integral of the function $f$, and we denote it by $(PV)\int_{-\infty}^\infty f(x)\,dx$. For example, if $f(x) = x/(1+x^2)$ then $\int_0^\infty f(x)\,dx = \infty$ and $\int_{-\infty}^0 f(x)\,dx = -\infty$, so that the improper integral does not exist. On the other hand, $f$ is an odd function, and so the Cauchy principal value is 0. Great caution is needed in handling singular integrals.
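The need for caution shows up already for $f(x) = x/(1+x^2)$: the value obtained depends on how the two ends are sent to infinity. The following sketch uses the primitive $\frac12\log(1+x^2)$; the function name `integral` and the lopsided truncation $[-R, 2R]$ are illustrative choices, not taken from the text.

```python
import math

def integral(lo, hi):
    """Integral of x/(1+x^2) over [lo, hi], using the primitive log(1+x^2)/2."""
    primitive = lambda x: 0.5 * math.log1p(x * x)
    return primitive(hi) - primitive(lo)

for R in (10.0, 1e3, 1e6):
    symmetric = integral(-R, R)       # symmetric truncation, as in the principal value
    lopsided = integral(-R, 2 * R)    # another way of letting both ends tend to infinity
    half_line = integral(0.0, R)      # grows without bound
    print(R, symmetric, lopsided, half_line)
```

The symmetric truncations vanish, the lopsided ones tend to $\log 2$, and the half-line integrals diverge, so no single value can sensibly be called the improper integral over $\mathbf{R}$.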

Next, it may happen that $f$ is defined on an interval $(a,b]$, that $f$ is not bounded, but that $f$ is bounded and Riemann integrable on every interval $[c,b]$, for $a < c < b$. If $\int_c^b f(x)\,dx$ tends to a limit $l$ as $c \searrow a$, then we set $l = \int_a^b f(x)\,dx$; this is a singular integral. For example, let $f(x) = x^{\alpha-1}$, where $x \in (0,1]$ and $0 < \alpha < 1$. This is unbounded as $x \searrow 0$, but

\[ \int_0^1 f(x)\,dx = \lim_{\epsilon\searrow 0}\int_\epsilon^1 f(x)\,dx = \lim_{\epsilon\searrow 0}(1-\epsilon^\alpha)/\alpha = 1/\alpha. \]


It may also happen that $f$ is defined on a set $[a,b]\setminus\{c\}$, where $c$ is an interior point of the interval $[a,b]$, and that $f$ is bounded and Riemann integrable on each of the intervals $[a,d]$ (with $a < d < c$) and $[e,b]$ (with $c < e < b$), while $f$ is unbounded.

Again we can proceed in two ways. First we can require that the two improper integrals $\int_a^c f(x)\,dx$ and $\int_c^b f(x)\,dx$ both exist, and then define the improper integral $\int_a^b f(x)\,dx$ to be their sum. Alternatively if

\[ \lim_{\epsilon\searrow 0}\left(\int_a^{c-\epsilon} f(x)\,dx + \int_{c+\epsilon}^{b} f(x)\,dx\right) \]

exists, then we again define the limit to be the Cauchy principal value of the integral, or the singular integral of the function $f$, and denote it by $(PV)\int_a^b f(x)\,dx$. For example if $f(x) = 1/x$ on $[-1,1]\setminus\{0\}$ then the singular integral $\int_0^1 f(x)\,dx = \infty$, and the improper integral $\int_{-1}^1 f(x)\,dx$ does not exist, whereas the principal value integral does exist, with value 0.

It is also possible to consider multiple singularities, and to con- 
sider improper singular integrals. In each case, ‘caution’ should be your 
watchword: results for Riemann integrable functions do not always extend 
to improper integrals and singular integrals, and each case should be treated 
on its merits. 

Finally, let us mention that we can extend all these results to complex- 
valued functions; we simply consider, and integrate, the real and imaginary 
parts separately. 


Exercises 


8.8.1 Suppose that $f$ is a real-valued function defined on $[0,\infty)$ which is Riemann integrable on $[0,b]$ for each $0 < b < \infty$. $f$ is said to be absolutely integrable if the improper integral $\int_0^\infty |f(x)|\,dx$ exists and is finite. Show that if $f$ is absolutely integrable then the improper integral $\int_0^\infty f(x)\,dx$ exists. [If the improper integral $\int_0^\infty f(x)\,dx$ exists, but $f$ is not absolutely integrable, then $f$ is said to be conditionally integrable.] (As with sequences, absolutely integrable functions are relatively well behaved, whereas conditionally integrable functions need to be handled with care.)

8.8.2 Suppose that $f$ and $g$ are real-valued functions defined on $[0,\infty)$ which are Riemann integrable on $[0,b]$ for each $0 < b < \infty$. Suppose also that $p$ and $q$ are conjugate indices for which the improper integrals $\int_0^\infty |f(x)|^p\,dx$ and $\int_0^\infty |g(x)|^q\,dx$ are finite. Show that $fg$ is absolutely integrable, and establish Hölder’s inequality for improper integrals:

\[ \left|\int_0^\infty f(x)g(x)\,dx\right| \le \int_0^\infty |f(x)g(x)|\,dx \le \left(\int_0^\infty |f(x)|^p\,dx\right)^{1/p}\left(\int_0^\infty |g(x)|^q\,dx\right)^{1/q}. \]

8.8.3 Prove carefully that

\[ \int_0^\infty \frac{\cos x}{1+x}\,dx = \int_0^\infty \frac{\sin x}{(1+x)^2}\,dx. \]

Show that the first integrand is conditionally integrable, and that the second is absolutely integrable.
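8.8.4 (Euler’s summation formula) Suppose that $f$ is differentiable on $[0,\infty)$.
(a) Suppose that the improper integral $\int_0^\infty f(x)\,dx$ and the improper sum $\sum_{j=1}^\infty f(j)$ both exist. Show that

\[ \sum_{j=1}^\infty f(j) = \int_0^\infty f(x)\,dx + \int_0^\infty (x - \lfloor x\rfloor)f'(x)\,dx. \]

(b) Suppose that $f$ is decreasing and that $f(x) \to 0$ as $x \to \infty$. Show that the improper integral $\int_0^\infty (x - \lfloor x\rfloor)f'(x)\,dx$ exists and that

\[ \sum_{j=1}^{\lfloor X\rfloor} f(j) - \int_0^X f(x)\,dx \to \int_0^\infty (x - \lfloor x\rfloor)f'(x)\,dx \quad\text{as } X \to \infty. \]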


8.8.5 Suppose that $f$ is a monotonic function on $[0,\pi/2]$. Show that

\[ \int_0^{\pi/2} f(x)\sin nx\,dx \to 0 \quad\text{as } n \to \infty. \]

8.8.6 Establish the identity

\[ (\sin 2nx - \sin(2n-2)x)\cos x = (\cos 2nx + \cos(2n-2)x)\sin x. \]

Show that

\[ \int_0^{\pi/2} \frac{\sin 2nx}{\sin x}\cos x\,dx = \pi/2. \]


Show that $1/\tan x - 1/x$ is a bounded monotonic function on $(0,\pi/2]$.
Show that 


Show that 


\[ \int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2}. \]

(This is an ingenious proof. We shall see in Volume III that complex analysis avoids the need for such ingenuity.)

8.8.7 Show that

\[ 1 + \frac12 + \frac13 + \cdots + \frac1n = \int_0^1 \frac{1-x^n}{1-x}\,dx = \int_0^n \frac{1-(1-t/n)^n}{t}\,dt. \]


8.8.8 The gamma function is defined as

\[ \Gamma(x) = \int_0^\infty t^{x-1}e^{-t}\,dt \quad\text{for } 0 < x < \infty. \]

Interpret this as an improper integral, and prove carefully that $\Gamma(x+1) = x\Gamma(x)$. Deduce that $\Gamma(n+1) = n!$, for $n \in \mathbf{N}$.
8.8.9 Show that

\[ \gamma = \lim_{n\to\infty}\left(\int_0^1 \frac{1-(1-t/n)^n}{t}\,dt - \int_1^n \frac{(1-t/n)^n}{t}\,dt\right). \]

Can we take the limit inside the integral? Yes, it is possible to prove this directly, but the limiting process becomes much clearer when these integrals are treated as Lebesgue integrals.

8.8.10 Prove the following continuous versions of Dirichlet’s test and Abel’s test.
(a) Dirichlet’s test. Suppose that $\phi$ is a decreasing non-negative function on $[0,\infty)$ and that $\phi(x) \to 0$ as $x \to \infty$. Suppose that $f$ is a function on $[0,\infty)$ for which the Riemann integral $F(x) = \int_0^x f(t)\,dt$ exists for all $x \in [0,\infty)$, and for which $F$ is bounded on $[0,\infty)$. Show that the improper Riemann integral $\int_0^\infty \phi(t)f(t)\,dt$ exists.
(b) Abel’s test. Suppose that $\phi$ is a decreasing non-negative function on $[0,\infty)$. Suppose that $f$ is a function on $[0,\infty)$ for which the improper Riemann integral $\int_0^\infty f(t)\,dt$ exists. Show that the improper Riemann integral $\int_0^\infty \phi(t)f(t)\,dt$ also exists.


Introduction to Fourier series


9.1 Introduction 


Recall that a function $f$ defined on $\mathbf{R}$ is $t$-periodic (where $t > 0$) if $f(s+t) = f(s)$ for all $s \in \mathbf{R}$: $t$ is called a period of $f$. If $f$ has period $t_0$ and $a > 0$ then the function $f(at)$ has period $t_0/a$. Thus, by scaling, we can, and shall, restrict attention to $2\pi$-periodic functions. For example, the functions $\cos nt$ and $\sin nt$, for $n \in \mathbf{Z}^+$, are examples of $2\pi$-periodic functions. More generally, a function of the form

\[ p(t) = a_0/2 + \sum_{j=1}^{n} a_j\cos jt + \sum_{j=1}^{n} b_j\sin jt, \]


where $a_j, b_j \in \mathbf{R}$, is called a real trigonometric polynomial. Trigonometric polynomials are $2\pi$-periodic. The question that Fourier asked, and began to answer, is ‘If $f$ is a $2\pi$-periodic function, can it be expressed as a limit of trigonometric polynomials?’ This question has led to an enormous amount of mathematics, which has many fundamental applications to the physical sciences. We shall however restrict our attention to the mathematical analysis of Fourier’s question.

Suppose that $f$ is a $2\pi$-periodic function. If $f$ is Riemann integrable over the interval $[-\pi,\pi]$, we say that $f$ is locally Riemann integrable. Note that this implies that $f$ is Riemann integrable over any bounded interval, and that

\[ \int_{-\pi}^{\pi} f(t)\,dt = \int_{t_0}^{t_0+2\pi} f(t)\,dt \quad\text{for any } t_0 \in \mathbf{R}. \]

The set of all locally Riemann integrable $2\pi$-periodic functions forms a vector space, which we denote by $V$. An element of $V$ is bounded. If $f \in V$ we set

\[ \|f\|_1 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|\,dt \quad\text{and}\quad \|f\|_\infty = \sup_{t\in[-\pi,\pi]} |f(t)|. \]

The function $\|.\|_\infty$ is a norm on $V$: it satisfies

\[ \|f+g\|_\infty \le \|f\|_\infty + \|g\|_\infty, \]
\[ \|\alpha f\|_\infty = |\alpha|\,\|f\|_\infty \]
and $\|f\|_\infty = 0$ if and only if $f = 0$,

for $f, g \in V$ and $\alpha \in \mathbf{R}$. The function $\|.\|_1$ is a semi-norm on $V$: it satisfies the first two conditions, but not the third. If $f \in V$ and $\|f\|_1 = 0$, we say that $f$ is a trivial function.

In this chapter, we use the results of analysis that we have obtained so far 
to obtain results concerning the Fourier analysis of functions in V. A more 
advanced theory requires the theory of Lebesgue integration, and we shall 
consider this in Volume III. 

Suppose that $p(t) = a_0/2 + \sum_{j=1}^{n} a_j\cos jt + \sum_{j=1}^{n} b_j\sin jt$ is a real trigonometric polynomial function. Can we find the coefficients $a_j$ and $b_j$ from the knowledge of $p$? Here, and elsewhere, orthogonality relations play an essential role. Since

\[ \cos a\cos b = \tfrac12(\cos(a+b) + \cos(a-b)), \]
\[ \sin a\sin b = \tfrac12(\cos(a-b) - \cos(a+b)) \]
\[ \text{and}\quad \sin a\cos b = \tfrac12(\sin(a+b) + \sin(a-b)), \]

and since

\[ \int_{-\pi}^{\pi} \cos mt\,dt = \int_{-\pi}^{\pi} \sin nt\,dt = 0 \]

for $m, n \in \mathbf{Z}$, $m \ne 0$, it follows that

\[ \int_{-\pi}^{\pi} \cos mt\cos nt\,dt = \int_{-\pi}^{\pi} \sin mt\sin nt\,dt = \int_{-\pi}^{\pi} \sin nt\cos pt\,dt = 0 \]

for $m, n, p \in \mathbf{Z}^+$, $m \ne n$. Since

\[ \cos^2 t = \tfrac12(1 + \cos 2t) \quad\text{and}\quad \sin^2 t = \tfrac12(1 - \cos 2t) \]

it follows that

\[ \int_{-\pi}^{\pi} \cos^2 mt\,dt = \int_{-\pi}^{\pi} \sin^2 nt\,dt = \pi, \quad\text{for } m, n \in \mathbf{Z},\ m, n \ne 0. \]

Hence

\[ a_j = \frac{1}{\pi}\int_{-\pi}^{\pi} p(t)\cos jt\,dt \quad\text{for } j \in \mathbf{Z}^+, \]

\[ b_j = \frac{1}{\pi}\int_{-\pi}^{\pi} p(t)\sin jt\,dt \quad\text{for } j \in \mathbf{N}. \]

Note that this justifies the fact that the constant term appears as $a_0/2$, rather than $a_0$, in the definition of a trigonometric polynomial.
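The recovery of the coefficients from $p$ can be illustrated numerically. In this sketch the integrals over $[-\pi,\pi]$ are approximated by an equally spaced rule, which is exact up to rounding for trigonometric polynomials of degree well below the number of sample points; the helper names and the particular test polynomial are hypothetical.

```python
import math

def integrate_periodic(g, m=4096):
    """Approximate the integral of a 2*pi-periodic function over [-pi, pi]."""
    h = 2 * math.pi / m
    return h * sum(g(-math.pi + i * h) for i in range(m))

def cosine_coefficient(p, j):
    """a_j = (1/pi) * integral of p(t) cos(jt) over [-pi, pi]."""
    return integrate_periodic(lambda t: p(t) * math.cos(j * t)) / math.pi

def sine_coefficient(p, j):
    """b_j = (1/pi) * integral of p(t) sin(jt) over [-pi, pi]."""
    return integrate_periodic(lambda t: p(t) * math.sin(j * t)) / math.pi

# Hypothetical test polynomial with a_0 = 3, a_2 = -1, b_1 = 0.5, all others zero.
p = lambda t: 1.5 - math.cos(2 * t) + 0.5 * math.sin(t)
print([cosine_coefficient(p, j) for j in range(4)])
print([sine_coefficient(p, j) for j in range(1, 4)])
```

Note that the constant term of `p` is $1.5 = a_0/2$, so the recovered value of $a_0$ is 3, as the convention in the text requires.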

This suggests that we consider similar integrals for more general functions than trigonometric polynomial functions. Suppose that $f \in V$. We set

\[ a_j(f) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos jt\,dt \quad\text{for } j \in \mathbf{Z}^+, \]

\[ b_j(f) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin jt\,dt \quad\text{for } j \in \mathbf{N}. \]

The numbers $a_j(f)$ are the Fourier cosine coefficients of $f$ and the numbers $b_j(f)$ are the Fourier sine coefficients of $f$. The Fourier series of $f$ is then the formal expression

\[ f(t) \sim a_0(f)/2 + \sum_{j=1}^\infty a_j(f)\cos jt + \sum_{j=1}^\infty b_j(f)\sin jt. \]


We can write this in another form. Let $A_0(f) = a_0(f)$ and $A_j(f) = (a_j(f)^2 + b_j(f)^2)^{1/2}$, for $j \in \mathbf{N}$. There exists $\phi_j(f) \in (-\pi,\pi]$ such that $\cos j\phi_j(f) = a_j(f)/A_j(f)$ and $\sin j\phi_j(f) = -b_j(f)/A_j(f)$. Then

\[ a_j(f)\cos jt + b_j(f)\sin jt = A_j(f)\cos j(t + \phi_j(f)), \]

so that

\[ f(t) \sim A_0(f)/2 + \sum_{j=1}^\infty A_j(f)\cos j(t + \phi_j(f)). \]

The quantities $A_j(f)\cos j(t + \phi_j(f))$ are the harmonics of $f$. $A_j(f)$ is the amplitude of the harmonic and $\phi_j(f)$ is its phase. The process of calculating the harmonics is called harmonic analysis.
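As a small illustration, the amplitude and phase of a single harmonic can be computed from its cosine and sine coefficients. The sketch assumes a non-zero amplitude and picks the phase $\phi$ with $\cos j\phi = a_j/A_j$ and $\sin j\phi = -b_j/A_j$, which is the convention that makes $a_j\cos jt + b_j\sin jt = A_j\cos j(t+\phi)$ hold; the function name `harmonic` is hypothetical.

```python
import math

def harmonic(a_j, b_j, j):
    """Return (A_j, phi_j) with a_j cos(jt) + b_j sin(jt) = A_j cos(j (t + phi_j)).

    Assumes (a_j, b_j) != (0, 0), so that the amplitude is positive.
    """
    amplitude = math.hypot(a_j, b_j)
    # atan2 returns an angle theta in (-pi, pi] with cos(theta) = a_j/A_j and
    # sin(theta) = -b_j/A_j; dividing by j gives a phase in (-pi/j, pi/j].
    phase = math.atan2(-b_j, a_j) / j
    return amplitude, phase

A, phi = harmonic(3.0, 4.0, 2)
t = 0.7
print(3.0 * math.cos(2 * t) + 4.0 * math.sin(2 * t), A * math.cos(2 * (t + phi)))
```

The phase is not unique: adding any multiple of $2\pi/j$ to it describes the same harmonic.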

Note that we use the symbol ~ rather than an equality sign. As we 
shall see in Section 9.6, the sum need not converge, even when f is a con- 
tinuous function. Our aim will be to see when it does converge to f(t), 
and to see if there are other ways to approximate f, using the Fourier 
coefficients. 

Let us remark that we can consider the cosine and sine series separately. Let

\[ f_e(t) = \tfrac12(f(t) + f(-t)) \quad\text{and}\quad f_o(t) = \tfrac12(f(t) - f(-t)). \]

Then $f_e$ is an even function ($f_e(t) = f_e(-t)$), $f_o$ is an odd function ($f_o(t) = -f_o(-t)$), and $f = f_e + f_o$. Further,

\[ a_j(f_e) = a_j(f), \quad b_j(f_e) = 0, \]
\[ a_j(f_o) = 0, \quad b_j(f_o) = b_j(f), \]

so that

\[ f_e(t) \sim a_0(f)/2 + \sum_{j=1}^\infty a_j(f)\cos jt \quad\text{and}\quad f_o(t) \sim \sum_{j=1}^\infty b_j(f)\sin jt. \]

Note that if $f$ is an even function then

\[ a_j(f) = \frac{2}{\pi}\int_0^{\pi} f(t)\cos jt\,dt \quad\text{for } j \in \mathbf{Z}^+, \]

and that if $f$ is an odd function then

\[ b_j(f) = \frac{2}{\pi}\int_0^{\pi} f(t)\sin jt\,dt \quad\text{for } j \in \mathbf{N}. \]

Suppose that $f$ is any Riemann integrable function on the interval $[0,\pi]$. We can extend $f$ to an even $2\pi$-periodic function by setting $f(t) = f(-t)$ for $t \in [-\pi,0]$, and setting $f(2\pi k + t) = f(t)$ for $t \in [-\pi,\pi]$ and $k \in \mathbf{Z}$. Note that the extension is continuous on $\mathbf{R}$ if $f$ is continuous on $[0,\pi]$. Thus Fourier cosine series become a tool to consider functions defined on the interval $[0,\pi]$.


9.2 Complex Fourier series 


We have seen that Euler’s formulae

\[ e^{it} = \cos t + i\sin t, \quad \cos t = \tfrac12(e^{it} + e^{-it}), \quad \sin t = \tfrac{1}{2i}(e^{it} - e^{-it}), \]

are useful, when manipulating formulae involving sine and cosine functions. There are however stronger underlying reasons for considering the complex case. We consider the doubly infinite sequence $(\gamma_n)_{n=-\infty}^{\infty}$ of $2\pi$-periodic functions defined by

\[ \gamma_n(t) = e^{int}. \]

The subset $\mathbf{T} = \{z \in \mathbf{C} : |z| = 1\}$ of $\mathbf{C}$ is a group under multiplication. Each function $\gamma_n$ is a $2\pi$-periodic continuous homomorphism of the additive group $(\mathbf{R},+)$ into $\mathbf{T}$ (which is surjective if $n \ne 0$), and in fact every such homomorphism is of this form (Exercise 9.2.1). Further, the set $\{\gamma_n : n \in \mathbf{Z}\}$ is a group under point-wise multiplication.




We therefore consider complex-valued $2\pi$-periodic functions. These are functions of a real variable $t$: if $f = u + iv$, where $u$ and $v$ are the real and imaginary parts of $f$, then $f$ is continuous, or $2\pi$-periodic, or Riemann integrable over a bounded interval $[a,b]$ if and only if $u$ and $v$ are, and the integral $\int_a^b f(t)\,dt$ is defined as

\[ \int_a^b f(t)\,dt = \int_a^b u(t)\,dt + i\int_a^b v(t)\,dt. \]

We therefore consider the complex vector space, which we again denote by $V$, of complex-valued locally Riemann integrable $2\pi$-periodic functions, and define $\|.\|_1$ and $\|.\|_\infty$ as in the real case. If $f \in V$ and $\|f\|_1 = 0$, we again say that $f$ is a trivial function.

A function of the form $\sum_{j=-n}^{n} c_j\gamma_j$ (where each $c_j \in \mathbf{C}$) is called a complex trigonometric polynomial. Fourier’s question then becomes ‘If $f \in V$, can $f$ be expressed as a limit of complex trigonometric polynomials?’

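Suppose that $f$ and $g$ are in $V$. We set

\[ \langle f, g\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\overline{g(t)}\,dt. \]

Note that this definition involves a complex conjugate. The function $(f,g) \to \langle f, g\rangle$ is an example of a complex semi-inner product; we shall study these further in Volume II. Let us list some of its properties, which follow immediately from the definition.

• $\langle f, f\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|^2\,dt \ge 0$.
• $\langle g, f\rangle = \overline{\langle f, g\rangle}$.
• $\langle \alpha_1 f_1 + \alpha_2 f_2, g\rangle = \alpha_1\langle f_1, g\rangle + \alpha_2\langle f_2, g\rangle$.
• $\langle f, \alpha_1 g_1 + \alpha_2 g_2\rangle = \overline{\alpha_1}\langle f, g_1\rangle + \overline{\alpha_2}\langle f, g_2\rangle$.

The functions $(\gamma_n)_{n=-\infty}^{\infty}$ then form an orthonormal set:

\[ \langle \gamma_m, \gamma_n\rangle = \begin{cases} 1 & \text{if } m = n, \\ 0 & \text{if } m \ne n. \end{cases} \]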


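If $p = \sum_{j=-n}^{n} c_j\gamma_j$ is a complex trigonometric polynomial, then it follows from the orthogonality relations that $c_j = \langle p, \gamma_j\rangle$ for $-n \le j \le n$. We consider similar semi-inner products for functions in $V$. If $f \in V$, we define its complex Fourier coefficients $(\hat f_n)_{n=-\infty}^{\infty}$ as $\hat f_n = \langle f, \gamma_n\rangle$, so that

\[ \hat f_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\overline{\gamma_n(t)}\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\gamma_{-n}(t)\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)e^{-int}\,dt. \]

In particular, $\hat f_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,dt$ is the average value of $f$ over the interval $[-\pi,\pi]$. We then write

\[ f \sim \sum_{n=-\infty}^{\infty} \hat f_n\gamma_n = \sum_{n=-\infty}^{\infty} \langle f, \gamma_n\rangle\gamma_n. \]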


We can define the cosine Fourier coefficients and the sine Fourier coefficients for complex valued functions. If $f \in V$, it is easy to pass between the complex Fourier coefficients and the cosine and sine Fourier coefficients. Let us define the reversal $R(f)$ of $f \in V$ by setting $R(f)(t) = f(-t)$. To avoid too many superscripts, we set $C(f) = \overline{f}$ and $S(f) = \overline{R(f)}$. We then have the following identities.

Proposition 9.2.1 Suppose that $f \in V$ and that $n \in \mathbf{N}$.

1. $\hat f_0 = a_0(f)/2$.
2. $\hat f_n = \tfrac12(a_n(f) - ib_n(f))$ and $\hat f_{-n} = \tfrac12(a_n(f) + ib_n(f))$.
3. If $f$ is an even function then $\hat f_{-n} = \hat f_n = \tfrac12 a_n(f)$, and if $f$ is an odd function then $\hat f_0 = 0$ and $\hat f_n = -\hat f_{-n} = -\tfrac{i}{2}b_n(f)$.
4. $a_n(f) = \hat f_n + \hat f_{-n}$ and $b_n(f) = i(\hat f_n - \hat f_{-n})$.
5. $\widehat{C(f)}_n = \overline{\hat f_{-n}}$.
6. $\widehat{R(f)}_n = \hat f_{-n}$.
7. $\widehat{S(f)}_n = \overline{\hat f_n}$.
8. If $f$ is real-valued, then $\hat f_{-n} = \overline{\hat f_n}$.

Proof The reader should verify these identities.


Exercises 


9.2.1 Suppose that $\gamma$ is a continuous $2\pi$-periodic homomorphism of $(\mathbf{R},+)$ into the multiplicative group $\mathbf{T}$. There exists $0 < \delta < \pi$ such that if $|t| < \delta$ then $|\gamma(t) - 1| < 1$.

(a) Suppose that $k > 2\pi/\delta$. Show that there exists $n \in \mathbf{Z}$, with $|n| < k/6$, such that $\gamma(2\pi/k) = e^{2\pi in/k}$.
(b) Show that $n$ does not depend upon $k$.
(c) Show that if $q \in \mathbf{Q}$ then $\gamma(2\pi q) = e^{2\pi inq}$.
(d) Use continuity to show that $\gamma = \gamma_n$.
9.2.2 Verify the identities of Proposition 9.2.1.


9.3 Uniqueness


The size of a function controls the size of its Fourier coefficients. We establish 
two fundamental inequalities. 


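Theorem 9.3.1 (Bessel’s inequality) If $f \in V$, then

\[ \sum_{n=-\infty}^{\infty} |\hat f_n|^2 \le \langle f, f\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|^2\,dt. \]

Proof Let $p_n = \sum_{j=-n}^{n} \hat f_j\gamma_j$. Then $\langle f, \gamma_k\rangle = \langle p_n, \gamma_k\rangle$ for $|k| \le n$, so that $\langle f, p_n\rangle = \langle p_n, p_n\rangle$, and similarly $\langle p_n, f\rangle = \langle p_n, p_n\rangle$. Hence

\[ 0 \le \langle f - p_n, f - p_n\rangle = \langle f, f\rangle - \langle f, p_n\rangle - \langle p_n, f\rangle + \langle p_n, p_n\rangle = \langle f, f\rangle - \langle p_n, p_n\rangle = \langle f, f\rangle - \sum_{j=-n}^{n} |\hat f_j|^2. \]

Since this holds for all $n \in \mathbf{N}$, the result follows.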


In fact, equality holds; we prove this (Parseval’s equation: Corollary 9.4.7) 
later. 
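Bessel's inequality can be observed numerically. The sketch below is illustrative only: it takes the sawtooth function $f(t) = t$ on $[-\pi,\pi)$ as a hypothetical test case and approximates the integrals by a midpoint rule, so the printed values carry a small quadrature error; the helper names are not from the text.

```python
import cmath
import math

def fourier_coefficient(f, n, m=4000):
    """Midpoint-rule approximation to (1/2pi) * integral of f(t) e^{-int} over [-pi, pi]."""
    h = 2 * math.pi / m
    total = 0j
    for i in range(m):
        t = -math.pi + (i + 0.5) * h
        total += f(t) * cmath.exp(-1j * n * t)
    return total * h / (2 * math.pi)

def mean_square(f, m=4000):
    """Midpoint-rule approximation to (1/2pi) * integral of |f(t)|^2 over [-pi, pi]."""
    h = 2 * math.pi / m
    return sum(abs(f(-math.pi + (i + 0.5) * h)) ** 2 for i in range(m)) * h / (2 * math.pi)

f = lambda t: t   # sawtooth on [-pi, pi), thought of as extended periodically
partial = sum(abs(fourier_coefficient(f, n)) ** 2 for n in range(-20, 21))
print(partial, mean_square(f))   # the partial sum stays below the mean square
```

Taking more coefficients pushes the partial sum up towards the mean square value, which is consistent with the equality (Parseval's equation) announced in the text.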


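Proposition 9.3.2 If $f \in V$ then

\[ |\hat f_n| \le \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|\,dt \le \sup_{t\in[-\pi,\pi]} |f(t)|. \]

Proof For

\[ |\hat f_n| = \frac{1}{2\pi}\left|\int_{-\pi}^{\pi} f(t)e^{-int}\,dt\right| \le \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)e^{-int}|\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|\,dt. \]

Corollary 9.3.3 If $f$ is a trivial function, then $\hat f_n = 0$ for all $n \in \mathbf{Z}$.

More importantly, the converse is true.

Theorem 9.3.4 If $f \in V$ and $\hat f_n = 0$ for all $n \in \mathbf{Z}$ then $f$ is trivial.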


Proof Let $f = u + iv$. Since $a_0(f) = 2\hat f_0$, and $a_n(f) = \hat f_n + \hat f_{-n}$ and $b_n(f) = i(\hat f_n - \hat f_{-n})$ for $n \in \mathbf{N}$, the Fourier cosine and sine coefficients of $f$ are all zero, and so therefore are the Fourier cosine and sine coefficients of $u$ and $v$. Consequently, if $p$ is a real trigonometric polynomial, then $\frac{1}{2\pi}\int_{-\pi}^{\pi} u(t)p(t)\,dt = 0$ and $\frac{1}{2\pi}\int_{-\pi}^{\pi} v(t)p(t)\,dt = 0$. Suppose that $\frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|\,dt > 0$. Then one of

\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} u^+(t)\,dt, \quad \frac{1}{2\pi}\int_{-\pi}^{\pi} u^-(t)\,dt, \quad \frac{1}{2\pi}\int_{-\pi}^{\pi} v^+(t)\,dt, \quad \frac{1}{2\pi}\int_{-\pi}^{\pi} v^-(t)\,dt \]

is non-zero. Suppose that $\frac{1}{2\pi}\int_{-\pi}^{\pi} u^+(t)\,dt > 0$. (The argument in the other cases is essentially the same.) By considering a lower sum for the Riemann integral of $u^+$, we see that there exists an interval $[t_0 - \eta, t_0 + \eta]$ in $[-\pi,\pi]$ and $\lambda > 0$ such that $u(t) \ge \lambda$ for $t \in [t_0 - \eta, t_0 + \eta]$. The idea now is to find a real trigonometric polynomial which is large on the interval $[t_0 - \eta/2, t_0 + \eta/2]$, positive on the interval $[t_0 - \eta, t_0 + \eta]$ and bounded in modulus by 1 for other values of $t$ in $[-\pi,\pi]$. Let $\alpha = \cos\eta/2 - \cos\eta$: then $\alpha > 0$. Let $l(t) = 1 + \cos(t - t_0) - \cos\eta$. Then

\[ l(t) \ge 1 + \alpha \quad\text{for } t \in [t_0 - \eta/2, t_0 + \eta/2], \]
\[ l(t) \ge 1 \quad\text{for } t \in [t_0 - \eta, t_0 + \eta] \]
\[ \text{and}\quad |l(t)| \le 1 \quad\text{for other values of $t$ in } [-\pi,\pi]. \]

[Figure 9.3a. The function $l(t)$.]

Let $M = \sup_{t\in\mathbf{R}} |u(t)|$. Thus if $k \in \mathbf{N}$ then

\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} u(t)(l(t))^k\,dt = I_1 + I_2 + I_3, \]

where

\[ I_1 = \frac{1}{2\pi}\int_{-\pi}^{t_0-\eta} u(t)(l(t))^k\,dt \ge -\frac{M(t_0 - \eta + \pi)}{2\pi}, \]
\[ I_2 = \frac{1}{2\pi}\int_{t_0-\eta}^{t_0+\eta} u(t)(l(t))^k\,dt \ge \frac{1}{2\pi}\int_{t_0-\eta/2}^{t_0+\eta/2} u(t)(l(t))^k\,dt \ge \frac{\eta\lambda(1+\alpha)^k}{2\pi}, \]
\[ I_3 = \frac{1}{2\pi}\int_{t_0+\eta}^{\pi} u(t)(l(t))^k\,dt \ge -\frac{M(\pi - (t_0 + \eta))}{2\pi}. \]

Thus

\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} u(t)(l(t))^k\,dt \ge \frac{\eta\lambda(1+\alpha)^k}{2\pi} - M, \]

which is positive for large enough $k$. Since $l^k$ is a trigonometric polynomial, we obtain a contradiction.


Corollary 9.3.5 If $f, g \in V$ and $\hat f_n = \hat g_n$ for all $n \in \mathbf{Z}$ then $f - g$ is trivial.


Here is an important application of the corollary. 


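Theorem 9.3.6 Suppose that $f$ is a continuous function in $V$ and that $\sum_{n=-\infty}^{\infty} |\hat f_n| < \infty$. Then $\sum_{j=-n}^{n} \hat f_j\gamma_j \to f$ uniformly as $n \to \infty$.

Proof It follows from Weierstrass’ uniform M test that $\sum_{j=-n}^{n} \hat f_j\gamma_j$ converges uniformly to a continuous function $g$ in $V$. Then

\[ \hat g_k = \lim_{n\to\infty} \frac{1}{2\pi}\int_{-\pi}^{\pi} \Big(\sum_{j=-n}^{n} \hat f_j\gamma_j(t)\Big)e^{-ikt}\,dt = \hat f_k. \]

It therefore follows from the corollary above that $f - g$ is trivial; since $f - g$ is continuous, $f = g$.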


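This result is useful when we consider the indefinite integral of a function in $V$. If $f \in V$, the function $t \to \int_0^t f(s)\,ds$ is not necessarily periodic. Instead, we consider the function $F(t) = \int_0^t f(s)\,ds - \hat f_0 t$, which is a continuous element of $V$.

Theorem 9.3.7 Suppose that $f \in V$. Let $F(t) = \int_0^t f(s)\,ds - \hat f_0 t$. If $n \ne 0$ then $\hat F_n = -i\hat f_n/n$. Further, $\sum_{n=-\infty}^{\infty} |\hat F_n| < \infty$, so that $\sum_{j=-n}^{n} \hat F_j\gamma_j$ converges uniformly to $F$ as $n \to \infty$.

Proof Suppose that $\epsilon > 0$. There exists a continuous function $g$ in $V$ with $\frac{1}{2\pi}\int_{-\pi}^{\pi} |f(s) - g(s)|\,ds < \epsilon/4\pi$ and with $\hat g_0 = \hat f_0$. Let $G(t) = \int_0^t g(s)\,ds - \hat g_0 t$. If $t \in [-\pi,\pi]$, $|G(t) - F(t)| \le \int_{-\pi}^{\pi} |f(s) - g(s)|\,ds < \epsilon/2$, so that $|\hat G_n - \hat F_n| < \epsilon/2$ for all $n \in \mathbf{Z}$. If $n \in \mathbf{Z}$ and $n \ne 0$, we integrate by parts:

\[ \hat G_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} G(s)e^{-ins}\,ds = \frac{-i}{2\pi n}\int_{-\pi}^{\pi} (g(s) - \hat g_0)e^{-ins}\,ds = -\frac{i\hat g_n}{n}. \]

But $|\hat g_n - \hat f_n| \le \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(s) - g(s)|\,ds < \epsilon/4\pi$, and so $|\hat F_n + i\hat f_n/n| < \epsilon$. Since this holds for all $\epsilon > 0$, $\hat F_n = -i\hat f_n/n$.

It now follows from the Cauchy-Schwarz inequality, and Bessel’s inequality, that

\[ \sum_{n=-\infty}^{\infty} |\hat F_n| \le \Big(|\hat F_0|^2 + \sum_{n\ne 0} |\hat f_n|^2\Big)^{1/2}\Big(1 + 2\sum_{n=1}^{\infty} \frac{1}{n^2}\Big)^{1/2} < \infty. \]

Thus $\sum_{j=-n}^{n} \hat F_j\gamma_j$ converges uniformly to $F$ as $n \to \infty$, by Theorem 9.3.6.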


Corollary 9.3.8 If $f$ is a continuously differentiable function in $V$, then $\sum_{n=-\infty}^{\infty} |\hat f_n| < \infty$ and $\sum_{j=-n}^{n} \hat f_j\gamma_j$ converges uniformly to $f$ as $n \to \infty$.


Let us give two examples. 


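Example 9.3.9 Suppose that $0 < \delta \le \pi/2$. Let $I_\delta(t) = \pi/\delta$ if $0 \le |t| < \delta$, let $I_\delta(t) = 0$ if $\delta \le |t| \le \pi$ and let $I_\delta(t + 2k\pi) = I_\delta(t)$ for $k \in \mathbf{Z}$. Then

\[ I_\delta(t) \sim 1 + \sum_{n=1}^{\infty} \frac{2\sin n\delta}{n\delta}\cos nt. \]

[Figure 9.3b. The function $I_\delta(t)$.]

Certainly $a_0(I_\delta)/2 = 1$. (This is the reason for the choice of the constant $\pi/\delta$.) Further

\[ \sum_{n=1}^{N} a_n(I_\delta)\cos nt = \sum_{n=1}^{N} \frac{2\sin n\delta\cos nt}{n\delta} = \sum_{n=1}^{N} \frac{\sin(n(t+\delta)) - \sin(n(t-\delta))}{n\delta}. \]

Now

\[ \Big|\sum_{n=1}^{N} \sin n\alpha\Big| \le \Big|\sum_{n=1}^{N} e^{in\alpha}\Big| = \frac{|e^{i(N+1)\alpha} - e^{i\alpha}|}{|e^{i\alpha} - 1|} \le \frac{2}{|e^{i\alpha/2} - e^{-i\alpha/2}|} = \frac{1}{|\sin\alpha/2|}. \]

Thus if $|t| \le \pi$, and if $|t - \delta| \ge \eta$ and $|t + \delta| \ge \eta$, then

\[ \Big|\sum_{n=1}^{N} (\sin(n(t+\delta)) - \sin(n(t-\delta)))\Big| \le \frac{2}{\sin\eta/2}. \]

It therefore follows from the uniform version of Dirichlet’s test that the Fourier cosine series converges to a continuous function on $[-\pi,\pi]\setminus\{\delta,-\delta\}$, and converges uniformly on

\[ [-\pi,\pi]\setminus\big((-\delta-\eta, -\delta+\eta)\cup(\delta-\eta, \delta+\eta)\big). \]

Does the Fourier series converge to $I_\delta$? We shall consider this question in Section 9.6.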


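Example 9.3.10 Suppose that $0 < \delta \le \pi/2$. Let

\[ J_\delta(t) = \begin{cases} \dfrac{\pi}{\delta}\Big(1 - \dfrac{|t|}{2\delta}\Big) & \text{if } 0 \le |t| < 2\delta, \\ 0 & \text{if } 2\delta \le |t| \le \pi, \end{cases} \]

and let $J_\delta(t + 2k\pi) = J_\delta(t)$ for $k \in \mathbf{Z}$. Then

\[ J_\delta(t) \sim 1 + \sum_{n=1}^{\infty} \frac{2\sin^2 n\delta}{n^2\delta^2}\cos nt. \]

Figure 9.3c. The function $J_\delta(t)$.

Once again, $a_0(J_\delta)/2 = 1$. If $n > 0$ then, integrating by parts, and using the identity $2\sin^2 a = 1 - \cos 2a$, it follows that

\[ a_n(J_\delta) = \frac{2}{\delta}\int_0^{2\delta} \Big(1 - \frac{t}{2\delta}\Big)\cos nt\,dt = \frac{1}{n\delta^2}\int_0^{2\delta} \sin nt\,dt = \frac{1}{n^2\delta^2}(1 - \cos 2n\delta) = \frac{2\sin^2 n\delta}{n^2\delta^2}. \]

In this case, all the Fourier coefficients are non-negative, and $\sum_{n=0}^{\infty} a_n(J_\delta) < \infty$, and so the Fourier series converges uniformly to $J_\delta$. When $\delta = \pi/2$ then $J_\delta(0) = 2$, $a_{2k-1}(J_\delta) = 8/(2k-1)^2\pi^2$, and $a_{2k} = 0$. Hence

\[ 2 = 1 + \sum_{k=1}^{\infty} \frac{8}{(2k-1)^2\pi^2}, \]

so that

\[ \sum_{k=1}^{\infty} \frac{1}{(2k-1)^2} = \frac{\pi^2}{8}. \]

Since

\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = \sum_{k=1}^{\infty} \frac{1}{(2k-1)^2} + \sum_{k=1}^{\infty} \frac{1}{(2k)^2} = \frac{\pi^2}{8} + \frac14\sum_{n=1}^{\infty} \frac{1}{n^2}, \]

it follows that

\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}. \]

This famous equation was proved by Euler, and was one of his early triumphs. But Euler did not know about Fourier series: we give another proof, due to him, in Section 10.8.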
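The convergence of the Fourier series of $J_\delta$, and the resulting value of the sum over odd squares, can be checked numerically. This is only a sketch: the function names are hypothetical, and the coefficient formula $a_n(J_\delta) = 2\sin^2 n\delta/(n\delta)^2$ is the one obtained in Example 9.3.10.

```python
import math

def J(delta, t):
    """The triangular function J_delta of Example 9.3.10, for t in [-pi, pi]."""
    return (math.pi / delta) * (1 - abs(t) / (2 * delta)) if abs(t) < 2 * delta else 0.0

def J_partial_sum(delta, t, N):
    """1 + sum_{n=1}^{N} a_n(J_delta) cos(nt), with a_n = 2 sin^2(n delta) / (n delta)^2."""
    return 1 + sum(2 * math.sin(n * delta) ** 2 / (n * delta) ** 2 * math.cos(n * t)
                   for n in range(1, N + 1))

delta = math.pi / 2
print(J_partial_sum(delta, 0.0, 10000), J(delta, 0.0))   # both close to 2

odd_sum = sum(1 / (2 * k - 1) ** 2 for k in range(1, 10001))
print(odd_sum, math.pi ** 2 / 8)                          # partial sum over odd squares
```

The partial sums approach their limits only at the rate $1/N$, so this is a check of the identity rather than a practical way of computing $\pi^2$.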
Exercises


9.3.1 Show that Ja 
oe 1 1 2.12 
k = 
1+ di V) (aot ae) = 16 7 


k=1 


9.3.2 Let $f(t) = t^2$ for $t \in [-\pi,\pi]$, and extend by periodicity. Calculate the Fourier coefficients of $f$, and obtain another proof of the equation $\sum_{n=1}^{\infty} 1/n^2 = \pi^2/6$.

9.3.3 Let $f(t) = 1$ for $|t| < \pi/2$, let $f(t) = -1$ for $\pi/2 \le |t| \le \pi$, and extend by periodicity. Calculate the Fourier coefficients of $f$.

9.3.4 Let $f(t) = \pi - 2|t|$ for $t \in [-\pi,\pi]$. Use Example 9.3.10 to calculate the Fourier coefficients of $f$.

9.3.5 Suppose that $f$ is a continuously differentiable function in $V$. Obtain an upper bound for $\sum_{n=-\infty}^{\infty} |\hat f_n|$.

9.3.6 Suppose that $(b_n)_{n=1}^{\infty}$ is a decreasing null sequence of real numbers. Show that $\sum_{n=1}^{\infty} b_n\sin nt$ converges for every $t \in \mathbf{R}$. Give examples to show that the sequence $(b_n)_{n=1}^{\infty}$ need not be the Fourier sine series of an element of $V$.

9.4 Convolutions, and Parseval’s equation 


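Suppose that $f \in V$ and that $\delta \in \mathbf{R}$. We set $T_\delta(f)(t) = f(t - \delta)$. $T_\delta(f)$ is a translate of $f$. Note that

\[ \widehat{T_\delta(f)}_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t-\delta)e^{-int}\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)e^{-in(t+\delta)}\,dt = e^{-in\delta}\hat f_n. \]

Let

\[ K_\delta(f) = \|f - T_\delta(f)\|_1 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t) - f(t-\delta)|\,dt. \]

$K_\delta(.)$ is a semi-norm on $V$, and $K_\delta(f) \le 2\|f\|_1$.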


Proposition 9.4.1 If $f \in V$ then $K_\delta(f) \to 0$ as $\delta \to 0$.

Proof A little thought shows that if $f$ is the indicator function of a proper subinterval of $[-\pi,\pi]$ then $K_\delta(f) = |\delta|/\pi$ for small enough values of $|\delta|$, and so the result holds for $f$. It then follows from the semi-norm properties of $K_\delta$ that the result holds for step-functions. If $f \in V$ and $\epsilon > 0$, there exists a step function $g$ with $\|f - g\|_1 < \epsilon/3$. Then

\[ K_\delta(f) \le K_\delta(f - g) + K_\delta(g) \le 2\epsilon/3 + K_\delta(g) < \epsilon, \]

if $|\delta|$ is sufficiently small.

If $f, g \in V$ we define the convolution product $f \star g$ to be the function

\[ (f \star g)(t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t-s)g(s)\,ds. \]


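Example 9.4.2 If $f \in V$ then $f \star \gamma_n = \hat f_n\gamma_n$. In particular,

\[ \gamma_m \star \gamma_n = \begin{cases} \gamma_n & \text{if } m = n, \\ 0 & \text{otherwise.} \end{cases} \]

For

\[ (f \star \gamma_n)(t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{in(t-s)}f(s)\,ds = \frac{e^{int}}{2\pi}\int_{-\pi}^{\pi} e^{-ins}f(s)\,ds = e^{int}\hat f_n. \]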


Here are some of the properties of convolutions. 


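Proposition 9.4.3 Suppose that $f, f_1, f_2, g \in V$ and $\alpha_1, \alpha_2 \in \mathbf{C}$.
(i) $f \star g$ is a continuous function.
(ii) $f \star g = g \star f$.
(iii) $T_\delta(f) \star g = T_\delta(f \star g)$.
(iv) $(\alpha_1 f_1 + \alpha_2 f_2) \star g = \alpha_1(f_1 \star g) + \alpha_2(f_2 \star g)$.

Proof (i) The function $f \star g$ is certainly $2\pi$-periodic. If $t, \delta \in \mathbf{R}$ then, changing variables and using periodicity,

\[ |(f \star g)(t+\delta) - (f \star g)(t)| = \left|\frac{1}{2\pi}\int_{-\pi}^{\pi} (f(t+\delta-s) - f(t-s))g(s)\,ds\right| = \left|\frac{1}{2\pi}\int_{-\pi}^{\pi} f(t-s)(g(s+\delta) - g(s))\,ds\right| \le \|f\|_\infty K_\delta(g), \]

and so the result follows from the preceding proposition.
(ii) Making the change of variables $u = t - s$, it follows that

\[ (g \star f)(t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(t-u)f(u)\,du = \frac{1}{2\pi}\int_{t-\pi}^{t+\pi} f(t-s)g(s)\,ds = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t-s)g(s)\,ds = (f \star g)(t). \]

(iii) For

\[ (T_\delta(f) \star g)(t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t-\delta-s)g(s)\,ds = (f \star g)(t-\delta) = T_\delta(f \star g)(t). \]

(iv) This follows immediately from the definition of convolution.

Convolution is an essential element of Fourier analysis.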
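Theorem 9.4.4 If $f, g \in V$ and $n \in \mathbf{Z}$ then $\widehat{(f \star g)}_n = \hat f_n\hat g_n$.

Proof Here is a quick and easy proof. Changing the order of integration,

\[ \widehat{(f \star g)}_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} \Big(\frac{1}{2\pi}\int_{-\pi}^{\pi} f(t-s)g(s)\,ds\Big)e^{-int}\,dt \]
\[ = \frac{1}{2\pi}\int_{-\pi}^{\pi} \Big(\frac{1}{2\pi}\int_{-\pi}^{\pi} f(t-s)e^{-int}\,dt\Big)g(s)\,ds \]
\[ = \frac{1}{2\pi}\int_{-\pi}^{\pi} \Big(\frac{1}{2\pi}\int_{-\pi}^{\pi} f(u)e^{-in(u+s)}\,du\Big)g(s)\,ds \]
\[ = \hat f_n\,\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-ins}g(s)\,ds = \hat f_n\hat g_n. \]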

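Unfortunately, we need to justify the change of order of integration. We do this for continuous functions in Volume II, and, more generally, in Volume III.

Instead, we proceed as follows. Note that $I_\delta \star I_\delta = J_\delta$, and that $(\widehat{(I_\delta)}_n)^2 = \widehat{(J_\delta)}_n$, where $I_\delta$ and $J_\delta$ are the functions of Examples 9.3.9 and 9.3.10. Thus the result holds when $f = g = I_\delta$. It now follows from Proposition 9.4.3 that if $D$ is a dissection of $[-\pi,\pi]$ into intervals of equal length and if $f, g$ are step functions that are constant on the intervals of $D$ then $\widehat{(f \star g)}_n = \hat f_n\hat g_n$.

We now use a standard approximation argument. If $f, g \in V$ and $\epsilon > 0$, there exist step functions $h$ and $j$, of the form described above, with $\|f - h\|_1 < \epsilon$ and $\|g - j\|_1 < \epsilon$. If $n \in \mathbf{Z}$ then

\[ f \star g = h \star j + (f - h) \star g + h \star (g - j) \]
\[ \text{and}\quad \hat f_n\hat g_n = \hat h_n\hat j_n + (\hat f_n - \hat h_n)\hat g_n + \hat h_n(\hat g_n - \hat j_n). \]

Hence

\[ |\widehat{(f \star g)}_n - \widehat{(h \star j)}_n| \le |\widehat{((f-h) \star g)}_n| + |\widehat{(h \star (g-j))}_n| \le \epsilon(\|g\|_\infty + \|h\|_\infty) \le \epsilon(\|g\|_\infty + \|f\|_\infty + \epsilon). \]

Similarly,

\[ |\hat f_n\hat g_n - \hat h_n\hat j_n| \le \epsilon(\|g\|_\infty + \|h\|_\infty) \le \epsilon(\|g\|_\infty + \|f\|_\infty + \epsilon). \]

Since $\widehat{(h \star j)}_n = \hat h_n\hat j_n$, it follows that

\[ |\widehat{(f \star g)}_n - \hat f_n\hat g_n| \le 2\epsilon(\|g\|_\infty + \|f\|_\infty + \epsilon). \]

Since $\epsilon$ is arbitrary, it follows that $\widehat{(f \star g)}_n = \hat f_n\hat g_n$.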
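The identity between the coefficients of a convolution and the products of the coefficients can be tested on examples. In the sketch below the functions `f` and `g` are hypothetical trigonometric polynomials, chosen so that the equally spaced rule used for every integral is exact up to rounding; the helper names `mean`, `coefficient` and `convolve` are not from the text.

```python
import cmath
import math

M = 256   # number of equally spaced sample points in [-pi, pi)
GRID = [-math.pi + 2 * math.pi * i / M for i in range(M)]

def mean(values):
    """(1/2pi) * integral over [-pi, pi], by the equally spaced rule on GRID."""
    return sum(values) / M

def coefficient(f, n):
    """Complex Fourier coefficient: the mean of f(t) e^{-int}."""
    return mean([f(t) * cmath.exp(-1j * n * t) for t in GRID])

def convolve(f, g):
    """Return the function t -> (1/2pi) * integral of f(t - s) g(s) ds."""
    return lambda t: mean([f(t - s) * g(s) for s in GRID])

f = lambda t: 1 + math.cos(t) + 0.5 * math.sin(2 * t)
g = lambda t: 2 - math.sin(t) + math.cos(2 * t)
h = convolve(f, g)
for n in range(-2, 3):
    print(n, coefficient(h, n), coefficient(f, n) * coefficient(g, n))
```

For general functions in $V$ the same comparison can be made, but then the sums only approximate the integrals and agreement is limited by the quadrature error.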
Corollary 9.4.5 Sv. frgne™ converges uniformly to (f x g)(t). 


Proof By the Cauchy—Schwarz inequality and Bessel’s inequality, 
1 1 
lo) lo) 2 ioe) 9 
> lanl (30 Ut) (30 tt 
n=—0o n=—0o n=—0oo 


< (2 [inora)* (2 [™ iatorat)* <0 


and so the result follows from Theorem 9.3.6. 


Nie 


Corollary 9.4.6 Jf f,g,he V then (fxg)xh=fx(gxh). 
Proof For each has Fourier series }~>-__ foe Gn-hne™. 
We can therefore write f * g * h for the common value. 

Corollary 9.4.7 (Parseval’s equation) 

1 oe = 

3, | fa de = YO fade 

n=—0o 

In particular, = ee Fits) ee raw 


Proof As in the previous section, let S(g)(t) = g(—t). Then 


a a ae. 


(S(g) * f)(0) = on f(t)g(t) dt and (S(g) * f)n = fnGn- 
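Example 9.4.8

\[ \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)^3} = 1 - \frac{1}{3^3} + \frac{1}{5^3} - \frac{1}{7^3} + \cdots = \frac{\pi^3}{32}. \]

Let $f = I_{\pi/2} * J_{\pi/2}$. Then $\hat f_0 = 1$; if $n > 0$ then

\[ \hat f_n = \begin{cases} 0 & \text{if $n$ is even,} \\ 8(-1)^k/\pi^3(2k+1)^3 & \text{if $n = 2k+1$ is odd,} \end{cases} \]

and $\hat f_n = \hat f_{-n}$ if $n < 0$. Also $f(0) = 3/2$, and so

\[ \frac{3}{2} = 1 + 16 \sum_{k=0}^{\infty} \frac{(-1)^k}{\pi^3 (2k+1)^3}, \]

which gives the result.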
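
As a numerical sanity check of Example 9.4.8 (an illustration only, not part of the argument), the partial sums of the alternating series can be compared with $\pi^3/32$. The function name `alternating_odd_cubes` and the number of terms are arbitrary choices made for this sketch.

```python
import math


def alternating_odd_cubes(terms: int) -> float:
    """Partial sum of sum_{k>=0} (-1)^k / (2k+1)^3 using `terms` terms."""
    return sum((-1) ** k / (2 * k + 1) ** 3 for k in range(terms))


partial = alternating_odd_cubes(10_000)
target = math.pi ** 3 / 32
# The series alternates with decreasing terms, so the error is at most
# the first omitted term, 1 / (2 * terms + 1)^3.
print(partial, target, abs(partial - target))
```

Because the terms decay like $1/k^3$, a few thousand terms already agree with $\pi^3/32$ to many decimal places.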


Exercises

9.4.1 By applying Parseval's equation to $J_{\pi/2}$, show that

\[ \sum_{n=1}^{\infty} 1/n^4 = \pi^4/90. \]

9.4.2 Calculate the function $I_{\pi/2} * J_{\pi/2}$, and deduce that

\[ \sum_{n=1}^{\infty} 1/n^6 = \pi^6/945. \]

9.4.3 Calculate the function $J_{\pi/2} * J_{\pi/2}$, and deduce that

\[ \sum_{n=1}^{\infty} 1/n^8 = \pi^8/9450. \]


9.5 An example 


Things can go wrong! We now give an example of an even continuous periodic
function whose Fourier series fails to converge at 0. First, let $f_j(t) = \sin 2j|t|$,
for $j \in \mathbf{N}$. Then $f_j$ is an even function, and

\[ a_0(f_j) = \frac{2}{\pi}\int_0^{\pi} \sin 2jt\,dt = 0. \]

If $n \in \mathbf{N}$ then

\[ a_n(f_j) = \frac{2}{\pi}\int_0^{\pi} \sin 2jt \cos nt\,dt
   = \frac{1}{\pi}\int_0^{\pi} \big(\sin(2j+n)t + \sin(2j-n)t\big)\,dt
   = \frac{2}{\pi(2j+n)} + \frac{2}{\pi(2j-n)} \quad\text{if $n$ is odd,} \]

and $a_n(f_j) = 0$ if $n$ is even. Note that $a_n(f_j)$ is non-negative if $n < 2j$ and is
negative if $n$ is odd and greater than $2j$. We now set


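\[ s_{2r}(f_j) = \sum_{n=1}^{2r-1} a_n(f_j)\cos 0 = \sum_{k=1}^{r} a_{2k-1}(f_j). \]

Note that $s_{2r}(f_j)$ increases for $r \le j$, and then decreases. The maximum
value is

\[ s_{2j}(f_j) = \frac{2}{\pi}\sum_{k=1}^{2j} \frac{1}{2k-1} > \frac{\log j}{\pi}. \]

On the other hand, if $r > j$ then

\[ s_{2r}(f_j) = \frac{2}{\pi}\Big(\sum_{k=1}^{r+j} \frac{1}{2k-1} - \sum_{k=1}^{r-j} \frac{1}{2k-1}\Big)
   = \frac{2}{\pi}\sum_{k=r-j+1}^{r+j} \frac{1}{2k-1} > 0. \]

Now let $N_j = 2^{j^3+1}$, let $g_j = f_{N_j}/j(j+1)$, let $h_k = \sum_{j=1}^{k} g_j$, and let
$h = \sum_{j=1}^{\infty} g_j$. Since $|f_j(t)| \le 1$ for all $t$ and all $j$, it follows from
Weierstrass' uniform M test that $h_k \to h$ uniformly, and so $h$ is continuous.
Further, $a_n(h) = \lim_{k\to\infty} a_n(h_k)$, and $s_{2r}(h) = \lim_{k\to\infty} s_{2r}(h_k)$. Since all
the summands are non-negative,

\[ s_{2N_j}(h) \ge \frac{s_{2N_j}(f_{N_j})}{j(j+1)} \ge \frac{\log N_j}{\pi j(j+1)} \ge \frac{j^2 \log 2}{\pi(j+1)}. \]

Thus the sequence $(s_{2r}(h))_{r=1}^{\infty}$ is unbounded, and so it does not converge.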


9.6 The Dirichlet kernel 
Suppose that $f \in V$. Let us look more closely at the partial sum
$s_n(f)(t) = \sum_{j=-n}^{n} \hat f_j e^{ijt}$. Since $\hat f_j = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(s)e^{-ijs}\,ds$, it follows that

\[ s_n(f)(t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(s)\Big(\sum_{j=-n}^{n} e^{ij(t-s)}\Big)\,ds = (D_n * f)(t), \]

where $D_n(0) = \sum_{j=-n}^{n} 1 = 2n+1$ and

\[ D_n(t) = \sum_{j=-n}^{n} e^{ijt} = \frac{\sin(n+\frac12)t}{\sin\frac12 t} = \frac{\sin nt}{\tan\frac12 t} + \cos nt \]

for $0 < |t| \le \pi$. The function $D_n$ is called the $n$-th Dirichlet kernel.
Here are the principal properties of the Dirichlet kernel. 


Theorem 9.6.1 Let $t_j = 2\pi j/(2n+1)$ and let $I_j = (1/2\pi)\int_{t_{j-1}}^{t_j} |D_n(t)|\,dt$,
for $1 \le j \le n$.

(i) $D_n$ is an even continuous function in $V$.
(ii) $D_n$ is a decreasing function on $[0, t_1]$.
(iii) $D_n(t_j) = 0$; $D_n(t) > 0$ if $t_{j-1} < t < t_j$, where $j$ is odd, and $D_n(t) < 0$
if $t_{j-1} < t < t_j$, where $j$ is even, for $1 \le j \le n$.
(iv) $I_1 < 1$ and $I_1 > I_2 > \cdots > I_n > 0$.
(v) If $-\pi \le a < b \le \pi$ then $|(1/2\pi)\int_a^b D_n(t)\,dt| < 2$.
(vi) $I_j > 2/\pi^2 j$, and $\frac{1}{2\pi}\int_{-\pi}^{\pi} |D_n(t)|\,dt > (4/\pi^2)\log n$.


Proof (i) and (ii) Each of the summands in the definition is continuous,
even, and decreasing on $[0, t_1]$.

(iii) Since $\sin\frac12 t > 0$ for $0 < t \le \pi$, this follows from the corresponding
properties of the function $\sin(n+\frac12)t$.

(iv) Since $0 < D_n(t) < 2n+1$ for $t \in (0, t_1)$, it follows that $0 < I_1 <
(2n+1)t_1/2\pi = 1$. Further, if $1 \le j < n$ then

\[ I_{j+1} = \frac{1}{2\pi}\int_{t_j}^{t_{j+1}} \frac{|\sin(n+\frac12)t|}{\sin\frac12 t}\,dt
   = \frac{1}{2\pi}\int_0^{t_1} \frac{|\sin(n+\frac12)t|}{\sin\frac12(t+t_j)}\,dt
   < \frac{1}{2\pi}\int_0^{t_1} \frac{|\sin(n+\frac12)t|}{\sin\frac12(t+t_{j-1})}\,dt = I_j. \]

(v) It is enough to show that $|(1/2\pi)\int_0^b D_n(t)\,dt| < 1$, for $0 < b \le \pi$, since
$D_n$ is an even function. This follows because the integral is the sum of terms
which decrease in absolute value, and alternate in sign.

(vi) If $t_{j-1} \le t \le t_j$ then

\[ |D_n(t)| \ge \frac{|\sin(n+\frac12)t|}{\sin\frac12 t_j} \ge \frac{2|\sin(n+\frac12)t|}{t_j}, \]

so that

\[ I_j > \frac{1}{\pi t_j}\int_{t_{j-1}}^{t_j} |\sin(n+\tfrac12)t|\,dt = \frac{1}{\pi t_j}\cdot\frac{2}{n+\frac12} = \frac{2}{\pi^2 j}. \]

Thus

\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} |D_n(t)|\,dt > 2\sum_{j=1}^{n} I_j > \frac{4}{\pi^2}\sum_{j=1}^{n}\frac{1}{j} > \frac{4}{\pi^2}\log n. \]

Figure 9.6. The Dirichlet kernel $D_5$.


The Dirichlet kernel is not very well behaved. First, it takes both positive
and negative values. Secondly, $\frac{1}{2\pi}\int_{-\pi}^{\pi} |D_n(t)|\,dt \to \infty$ as $n \to \infty$. This last
property underlies the fact that Fourier series of continuous functions in $V$
need not converge point-wise.
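
The two facts just quoted can be explored numerically. The following is a minimal sketch, assuming nothing beyond the formulae above: it compares the defining sum for $D_n$ with the closed form $\sin(n+\frac12)t/\sin\frac12 t$, and approximates $\frac{1}{2\pi}\int_{-\pi}^{\pi}|D_n(t)|\,dt$ by a midpoint rule so that it can be compared with the lower bound $(4/\pi^2)\log n$ of Theorem 9.6.1 (vi). The function names and the step count are illustrative choices.

```python
import math


def dirichlet_sum(n: int, t: float) -> float:
    """D_n(t) = sum_{j=-n}^{n} e^{ijt} = 1 + 2 * sum_{j=1}^{n} cos(jt)."""
    return 1.0 + 2.0 * sum(math.cos(j * t) for j in range(1, n + 1))


def dirichlet_closed(n: int, t: float) -> float:
    """Closed form sin((n + 1/2)t) / sin(t/2), valid for 0 < |t| <= pi."""
    return math.sin((n + 0.5) * t) / math.sin(0.5 * t)


def l1_norm(n: int, steps: int = 20_000) -> float:
    """Midpoint-rule estimate of (1/2pi) * integral of |D_n| over [-pi, pi].

    D_n is even, so this equals (1/pi) times the integral over [0, pi];
    the midpoints never hit t = 0, where the closed form is undefined.
    """
    h = math.pi / steps
    total = sum(abs(dirichlet_closed(n, (k + 0.5) * h)) for k in range(steps))
    return total * h / math.pi


for n in (2, 8, 32):
    print(n, l1_norm(n), 4 / math.pi ** 2 * math.log(n))
```

The printed estimates grow slowly with $n$ and stay above the logarithmic bound, in line with part (vi).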

Nevertheless, the Dirichlet kernel can be used to provide useful information
about the convergence of Fourier series. We need to impose conditions that
are generally stronger than continuity. Suppose that $f \in V$, that $t \in [-\pi, \pi]$
and that we want to investigate the convergence of the Fourier series of $f$ at
$t$. We set

\[ \phi_t(f)(s) = \tfrac12\big(f(t+s) + f(t-s)\big) - f(t) \quad\text{for } |s| \le \pi, \]

\[ \theta_t(f)(s) = \begin{cases} \phi_t(f)(s)/\tan\frac12 s & \text{for } 0 < |s| < \pi, \\ 0 & \text{for } s = 0 \text{ and } |s| = \pi, \end{cases} \]

and extend by periodicity.
Note that $\phi_t(f)$ is an even function and that $\theta_t(f)$ is an odd function. The
function $\phi_t(f)$ is in $V$, and $\|\phi_t(f)\|_\infty \le 2\|f\|_\infty$.
The function $\theta_t(f)$ can behave badly near 0 (although the function
$\theta_t(f)(s)\sin ns$ is bounded on $[-\pi, \pi]$). If $f$ is differentiable at $t$ then
$\theta_t(f)(s) \to 0$ as $s \to 0$. Since the Dirichlet kernel is an even function, and
$\frac{1}{2\pi}\int_{-\pi}^{\pi} D_n(t)\,dt = 1$,

\[ s_n(f)(t) - f(t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \big(f(t-s) - f(t)\big) D_n(s)\,ds
   = \frac{1}{\pi}\int_0^{\pi} \theta_t(f)(s)\sin ns\,ds + \frac{1}{\pi}\int_0^{\pi} \phi_t(f)(s)\cos ns\,ds. \]

We use this to give a criterion for the Fourier series of $f$ to converge to
$f(t)$ at $t$.
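Theorem 9.6.2 (Dini's test) If the improper integral

\[ \frac{1}{\pi}\int_0^{\pi} |\theta_t(f)(s)|\,ds = \lim_{\eta \searrow 0} \frac{1}{\pi}\int_{\eta}^{\pi} |\theta_t(f)(s)|\,ds \]

is finite, then $s_n(f)(t) \to f(t)$ as $n \to \infty$.


Proof If $\epsilon > 0$ there exists $0 < \eta < \pi$ such that

\[ \frac{1}{\pi}\int_0^{\eta} |\theta_t(f)(s)|\,ds = \frac{1}{\pi}\int_0^{\pi} |\theta_t(f)(s)|\,ds - \frac{1}{\pi}\int_{\eta}^{\pi} |\theta_t(f)(s)|\,ds < \epsilon/2. \]

Let

\[ g(s) = \begin{cases} 0 & \text{if } |s - t| < \eta, \\ f(s) & \text{if } \eta \le |s - t| \le \pi, \end{cases} \]

and extend $g$ by periodicity to obtain a function in $V$. Then $\theta_t(g)$ is an odd
function in $V$ which vanishes in $(-\eta, \eta)$, and

\[ s_n(g)(t) = \frac{1}{\pi}\int_0^{\pi} \theta_t(g)(s)\sin ns\,ds + \frac{1}{\pi}\int_0^{\pi} \phi_t(g)(s)\cos ns\,ds = \tfrac12\big(b_n(\theta_t(g)) + a_n(\phi_t(g))\big). \]

Thus $s_n(g)(t) \to 0$ as $n \to \infty$, and so there exists $n_0$ such that $|s_n(g)(t)| < \epsilon/2$
for $n \ge n_0$.

On the other hand,

\[ |s_n(g)(t) - (s_n(f)(t) - f(t))| = \Big|\frac{1}{\pi}\int_0^{\eta} \theta_t(f)(s)\sin ns\,ds\Big| \le \frac{1}{\pi}\int_0^{\eta} |\theta_t(f)(s)|\,ds < \epsilon/2, \]

and so $|s_n(f)(t) - f(t)| < \epsilon$ for $n \ge n_0$.

Note that the condition in Dini's test is equivalent to the requirement that
the improper integral $(1/\pi)\int_0^{\pi} |\phi_t(f)(s)|/s\,ds$ should be finite.
We can say more, if $f$ vanishes on an interval.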


Theorem 9.6.3 (Riemann's localization theorem) Suppose that $f \in V$,
that $[a, b] \subseteq [-\pi, \pi]$ and that $f(t) = 0$ for $t \in [a, b]$. Suppose that $0 < \delta <
(b-a)/2$. Then $s_n(f)(t) \to 0$ as $n \to \infty$ uniformly on $[a+\delta, b-\delta]$.


Proof We need two lemmas, of interest in themselves. 


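Lemma 9.6.4 If $f \in V$ and $n \in \mathbf{Z} \setminus \{0\}$ then $|\hat f_n| \le K_{\pi/|n|}(f)/2$.


Proof Since $e^{-in(t+\pi/n)} = -e^{-int}$,

\[ |\hat f_n| = \frac12\Big|\frac{1}{2\pi}\int_{-\pi}^{\pi} \big(f(t)e^{-int} + f(t+\pi/n)e^{-in(t+\pi/n)}\big)\,dt\Big|
   = \frac12\Big|\frac{1}{2\pi}\int_{-\pi}^{\pi} \big(f(t) - f(t+\pi/n)\big)e^{-int}\,dt\Big| \le \frac{K_{\pi/|n|}(f)}{2}. \]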


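Corollary 9.6.5 If $n \in \mathbf{N}$ then

\[ |a_n(f)| \le K_{\pi/n}(f) \quad\text{and}\quad |b_n(f)| \le K_{\pi/n}(f), \]

since $a_n(f) = \hat f_n + \hat f_{-n}$ and $b_n(f) = i(\hat f_n - \hat f_{-n})$.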


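Lemma 9.6.6 Suppose that $f, g \in V$, that $t \in \mathbf{R}$ and that $\delta > 0$. Let
$h_t(s) = f(t-s)g(s)$. Then

\[ K_\delta(h_t) \le \|g\|_\infty K_\delta(f) + \|f\|_\infty K_\delta(g). \]


Proof

\[ K_\delta(h_t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t-s-\delta)g(s+\delta) - f(t-s)g(s)|\,ds \]
\[ \le \frac{1}{2\pi}\int_{-\pi}^{\pi} |(f(t-s-\delta) - f(t-s))g(s+\delta)|\,ds + \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t-s)(g(s+\delta) - g(s))|\,ds \]
\[ \le \|g\|_\infty \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t-s-\delta) - f(t-s)|\,ds + \|f\|_\infty \frac{1}{2\pi}\int_{-\pi}^{\pi} |g(s+\delta) - g(s)|\,ds \]
\[ = \|g\|_\infty K_\delta(f) + \|f\|_\infty K_\delta(g). \]

The importance of this lemma is that the right-hand side of the inequality
does not involve $t$.
We now prove the theorem. Let

\[ g(t) = \begin{cases} 0 & \text{if } |t| \le \delta, \text{ or } |t| = \pi, \\ 1/\tan\frac12 t & \text{if } \delta < |t| < \pi, \end{cases} \]

and extend by periodicity. Then $g \in V$. If $t \in [a+\delta, b-\delta]$ then

\[ s_n(f)(t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t-s)g(s)\sin ns\,ds + \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t-s)\cos ns\,ds, \]

so that

\[ |s_n(f)(t)| \le \tfrac12\big(\|g\|_\infty K_{\pi/n}(f) + \|f\|_\infty K_{\pi/n}(g) + K_{\pi/n}(f)\big). \]

The right-hand side of this inequality does not involve $t$, and tends to 0 as
$n \to \infty$.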


Suppose that $f \in V$, that $t_0 \in (-\pi, \pi]$ and that $0 < \eta < \pi$. Let $g(t) = f(t)$
if $|t - t_0| < \eta$, let $g(t) = 0$ if $\eta \le |t - t_0| \le \pi$, and extend by periodicity. Since
$f - g = 0$ on $(t_0 - \eta, t_0 + \eta)$, Riemann's localization theorem says that the
Fourier series for $f$ converges at $t_0$ (or at any point in $(t_0 - \eta, t_0 + \eta)$) if and
only if the same holds for $g$: convergence is a local property, depending only
on the values of $f$ near $t_0$.

Let us apply these results to the function $I_\delta$ of Example 9.3.9. It follows
that

\[ s_n(I_\delta)(t) \to \begin{cases} \pi/\delta & \text{uniformly in } |t| \le \delta - \eta, \text{ for } 0 < \eta < \delta, \\ \pi/2\delta & \text{if } t = \delta \text{ or } t = -\delta, \\ 0 & \text{uniformly in } \delta + \eta \le |t| \le \pi, \text{ for } 0 < \eta < \pi - \delta. \end{cases} \]

In particular, if we set $\delta = \pi/2$ and $t = 0$, it follows that

\[ 1 - \frac13 + \frac15 - \frac17 + \cdots = \frac{\pi}{4}. \]


We now show that if $f$ is monotonic in an open interval, then the Fourier
series converges at each point of the interval. Recall that if $f$ is increasing in
an open interval $I$ and $t \in I$ then $f(t+) = \inf\{f(s) : s \in I, s > t\}$ and
$f(t-) = \sup\{f(s) : s \in I, s < t\}$.


Theorem 9.6.7 (Jordan's theorem) Suppose that $f \in V$ and that $f$ is
monotonic in an open interval $I$. If $t \in I$ then the Fourier series for $f$
converges at $t$ to $\frac12(f(t+) + f(t-))$.

Proof We make several simplifications: we remove the jump, and we localize.
We can clearly suppose that $f$ is increasing in $I$. By a change of variable, we
can suppose that $t = 0$, so that we need to show that $\sum_{k=-n}^{n} \hat f_k \to \frac12(f(0+) +
f(0-))$ as $n \to \infty$. We can also suppose that $f(0) = \frac12(f(0+) + f(0-))$. Let $j(0) =
j(\pi) = 0$, let $j(s) = 1$ for $0 < s < \pi$ and $j(s) = -1$ for $-\pi < s < 0$, and
extend $j$ by periodicity. Then $j$ is an odd function, and so $\sum_{k=-n}^{n} \hat j_k = 0$ for
all $n \in \mathbf{N}$. Now let

\[ g(s) = f(s) - \tfrac12\big(f(0+) - f(0-)\big)j(s) - f(0). \]

Then

\[ \sum_{k=-n}^{n} \hat g_k = \sum_{k=-n}^{n} \hat f_k - f(0), \]

and so we need to show that $\sum_{k=-n}^{n} \hat g_k \to 0$ as $n \to \infty$.

From the construction, $g(0) = 0$ and $g$ is continuous at 0. Suppose that
$\epsilon > 0$. There exists $\delta > 0$ such that $(-\delta, \delta) \subseteq I$ and such that $-\epsilon/5 <
g(-\delta) \le g(\delta) < \epsilon/5$. Now set $h(s) = g(s)$ if $|s| \le \delta$ and set $h(s) = 0$ if
$\delta < |s| \le \pi$, and extend by periodicity. Since $g(s) - h(s) = 0$ on $[-\delta, \delta]$,
$\sum_{j=-n}^{n} \hat h_j - \sum_{j=-n}^{n} \hat g_j \to 0$ as $n \to \infty$, by Riemann's localization theorem.
Thus there exists $n_0$ such that

\[ \Big|\sum_{j=-n}^{n} \hat h_j - \sum_{j=-n}^{n} \hat g_j\Big| < \epsilon/5 \quad\text{for } n \ge n_0. \]

By Du Bois-Reymond's mean-value theorem (Corollary 8.6.4), there exists
$-\delta < c < \delta$ such that

\[ \sum_{j=-n}^{n} \hat h_j = \frac{1}{2\pi}\int_{-\delta}^{\delta} D_n(s)h(s)\,ds
   = \frac{h(-\delta)}{2\pi}\int_{-\delta}^{c} D_n(s)\,ds + \frac{h(\delta)}{2\pi}\int_{c}^{\delta} D_n(s)\,ds. \]

Using Theorem 9.6.1 (v), it follows that

\[ \Big|\sum_{j=-n}^{n} \hat h_j\Big| \le |h(-\delta)|\,\Big|\frac{1}{2\pi}\int_{-\delta}^{c} D_n(s)\,ds\Big| + |h(\delta)|\,\Big|\frac{1}{2\pi}\int_{c}^{\delta} D_n(s)\,ds\Big| < \frac{4\epsilon}{5}. \]

Thus $|\sum_{j=-n}^{n} \hat g_j| < \epsilon$ for $n \ge n_0$.

Exercise


9.6.1 Suppose that $f \in V$ and that $f$ is monotonic and continuous in an
open interval $I$. Show that the Fourier series for $f$ converges uniformly
to $f$ in any closed subinterval of $I$.


9.7 The Fejér kernel and the Poisson kernel 


If $f$ is a continuous function in $V$, it is an easy matter to calculate its
harmonics. On the other hand, the example of Section 9.5 shows that the partial
sums $s_n(f)(t) = \sum_{j=-n}^{n} \hat f_j e^{ijt}$ need not converge to $f(t)$. Can we use the
harmonics to reconstruct $f$? This is the problem of harmonic synthesis. We
give two important examples of harmonic synthesis.

The first was given by Lipót Fejér, at the age of nineteen. It is based on
the idea that the average of terms in a sequence can behave better than the
terms themselves. If $f \in V$, we set

\[ \sigma_n(f) = \frac{1}{n+1}\sum_{j=0}^{n} s_j(f) = \frac{1}{n+1}\sum_{j=0}^{n} (D_j * f) = F_n * f, \]

where $F_n = (\sum_{j=0}^{n} D_j)/(n+1)$ is the Fejér kernel. Using the formulae

\[ 2\sin(j+\tfrac12)t \sin\tfrac12 t = \cos jt - \cos(j+1)t \quad\text{and}\quad 1 - \cos 2at = 2\sin^2 at, \]

we see that $F_n(0) = n+1$ and that

\[ F_n(t) = \frac{1}{n+1}\sum_{j=0}^{n} \frac{2\sin(j+\frac12)t \sin\frac12 t}{2\sin^2\frac12 t}
   = \frac{1}{n+1}\Big(\frac{1 - \cos(n+1)t}{2\sin^2\frac12 t}\Big)
   = \frac{1}{n+1}\Big(\frac{\sin^2((n+1)t/2)}{\sin^2\frac12 t}\Big) \]

for $0 < |t| \le \pi$.
The Fejér kernel has three important properties.

• $F_n(t)$ is a non-negative function.
• $\frac{1}{2\pi}\int_{-\pi}^{\pi} F_n(t)\,dt = \sum_{j=0}^{n} \big(\frac{1}{2\pi}\int_{-\pi}^{\pi} D_j(t)\,dt\big)/(n+1) = 1$.
• $F_n(t) \to 0$ uniformly on $\{t : \delta \le |t| \le \pi\}$ as $n \to \infty$, for $0 < \delta < \pi$.

A sequence of functions in $V$ with these properties is called an approximate
identity.
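
The properties of the Fejér kernel listed above can be checked numerically. The sketch below (illustrative names; the midpoint rule and its step count are assumptions of this example, not part of the text) forms $F_n$ as the average of Dirichlet kernels, compares it with the closed form, and estimates the mean value $\frac{1}{2\pi}\int_{-\pi}^{\pi} F_n(t)\,dt$.

```python
import math


def dirichlet(j: int, t: float) -> float:
    """D_j(t) = 1 + 2 * sum_{k=1}^{j} cos(kt)."""
    return 1.0 + 2.0 * sum(math.cos(k * t) for k in range(1, j + 1))


def fejer_average(n: int, t: float) -> float:
    """F_n(t) as the average (D_0 + ... + D_n) / (n + 1)."""
    return sum(dirichlet(j, t) for j in range(n + 1)) / (n + 1)


def fejer_closed(n: int, t: float) -> float:
    """Closed form sin^2((n+1)t/2) / ((n+1) sin^2(t/2)), for 0 < |t| <= pi."""
    return math.sin((n + 1) * t / 2) ** 2 / ((n + 1) * math.sin(t / 2) ** 2)


def mean_value(n: int, steps: int = 2000) -> float:
    """Midpoint-rule estimate of (1/2pi) * integral of F_n over [-pi, pi].

    With an even number of steps no midpoint equals 0.
    """
    h = 2 * math.pi / steps
    return sum(fejer_closed(n, -math.pi + (k + 0.5) * h) for k in range(steps)) / steps


print(fejer_average(6, 0.9), fejer_closed(6, 0.9))
print(mean_value(6))
# Away from 0 the kernel is at most 1 / ((n+1) sin^2(delta/2)) on delta <= |t| <= pi.
print(max(fejer_closed(200, 0.5 + 0.01 * k) for k in range(200)))
```

The closed form makes non-negativity evident, and the last line illustrates the uniform decay away from 0 for a moderately large $n$.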


Theorem 9.7.1 If $(\phi_n)_{n=0}^{\infty}$ is an approximate identity in $V$ and $f \in V$
is continuous at $t_0$ then $(\phi_n * f)(t_0) \to f(t_0)$ as $n \to \infty$. If $f$ is continuous
on a closed interval $[a, b]$ and $0 < \eta < (b-a)/2$, then $(\phi_n * f)(t) \to f(t)$, as
$n \to \infty$, uniformly in $[a+\eta, b-\eta]$.


Figure 9.7a. The Fejér kernel.


Proof Suppose that $\epsilon > 0$. There exists $0 < \delta < \pi$ such that if $|s - t_0| < \delta$
then $|f(s) - f(t_0)| < \epsilon/3$. Then

\[ |(\phi_n * f)(t_0) - f(t_0)| = \Big|\frac{1}{2\pi}\int_{-\pi}^{\pi} \big(f(t_0 - s) - f(t_0)\big)\phi_n(s)\,ds\Big| \le I_1 + I_2 + I_3, \]

where

\[ I_1 = \frac{1}{2\pi}\int_{-\pi}^{-\delta} \big(|f(t_0 - s)| + |f(t_0)|\big)\phi_n(s)\,ds \le \|f\|_\infty \sup\{\phi_n(t) : -\pi \le t \le -\delta\}, \]

\[ I_2 = \frac{1}{2\pi}\int_{-\delta}^{\delta} |f(t_0 - s) - f(t_0)|\,\phi_n(s)\,ds \le \epsilon/3, \]

\[ I_3 = \frac{1}{2\pi}\int_{\delta}^{\pi} \big(|f(t_0 - s)| + |f(t_0)|\big)\phi_n(s)\,ds \le \|f\|_\infty \sup\{\phi_n(t) : \delta \le t \le \pi\}, \]

so that there exists $n_0$ such that $|(\phi_n * f)(t_0) - f(t_0)| < \epsilon$ for $n \ge n_0$.
If $f$ is continuous on $[a, b]$ then it is uniformly continuous, and there exists
$0 < \delta < \pi$ such that if $s, t \in [a, b]$ and $|s - t| < \delta$ then $|f(s) - f(t)| < \epsilon/3$.
Thus, choosing $\delta < \eta$ in the preceding argument, $n_0$ can be chosen so that
$|(\phi_n * f)(t) - f(t)| < \epsilon$ for all $t \in [a+\eta, b-\eta]$, for $n \ge n_0$.


Corollary 9.7.2 If $f$ is a continuous function in $V$ then $\sigma_n(f)(t) \to f(t)$
as $n \to \infty$, uniformly in $t$.


We have a version of Riemann's localization theorem.


Corollary 9.7.3 If $f(t) = 0$ on a closed interval $[a, b]$ and $0 < \eta <
(b-a)/2$ then $(\phi_n * f)(t) \to 0$, as $n \to \infty$, uniformly in $[a+\eta, b-\eta]$.


Proposition 9.7.4 If $f \in V$ and if $s_n(f)(t) \to l$ as $n \to \infty$ then
$\sigma_n(f)(t) \to l$ as $n \to \infty$.


Proof This is part of Exercise 4.6.2.
This has the following important consequence.


Corollary 9.7.5 If $f \in V$ is continuous at $t_0$ and if $s_n(f)(t_0)$ converges
as $n \to \infty$, then it converges to $f(t_0)$.


Proof For if $s_n(f)(t_0) \to l$ as $n \to \infty$, then $\sigma_n(f)(t_0) \to l$ as $n \to \infty$, and
so $l = f(t_0)$.


Corollary 9.7.2 shows that a continuous function in V can be uniformly 
approximated by trigonometric polynomials: we use this to show that a 
continuous function on [0,1] can be uniformly approximated by polynomials. 


Theorem 9.7.6 If $f$ is a continuous function on $[0, 1]$ and $\epsilon > 0$ there
exists a polynomial $p$ such that $|f(x) - p(x)| < \epsilon$ for all $x \in [0, 1]$.


Proof We need a lemma. 


Lemma 9.7.7 For each $n \in \mathbf{Z}^+$ there exists a polynomial $T_n$ such that
$\cos nt = T_n(\cos t)$ for all $t \in \mathbf{R}$.


Proof The proof is by induction on $n$. The result is true for $n = 1$ and $n = 2$.
Suppose that it is true for all $m \le n$, where $n > 1$. Then

\[ \cos(n+1)t = 2\cos nt \cos t - \cos(n-1)t = 2T_n(\cos t)\cos t - T_{n-1}(\cos t). \]

The polynomial $T_n$ is called the $n$-th Chebyshev polynomial.

We now prove the theorem. Let $g(t) = f(\cos t)$. Then $g$ is a continuous
even function in $V$, and so there exists $n \in \mathbf{N}$ such that $|\sigma_n(g)(t) - g(t)| < \epsilon$
for all $t$. But $\sigma_n(g)$ is an even trigonometric polynomial, and so

\[ \sigma_n(g)(t) = \sum_{j=0}^{n} c_j \cos jt = \sum_{j=0}^{n} c_j T_j(\cos t) \]

for some constants $c_0, \ldots, c_n$. Thus if $x = \cos t \in [0, 1]$ and $p = \sum_{j=0}^{n} c_j T_j$
then

\[ |f(x) - p(x)| = |f(\cos t) - p(\cos t)| = |g(t) - \sigma_n(g)(t)| < \epsilon. \]
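
The recurrence in the proof of Lemma 9.7.7 is easy to turn into a computation. The following sketch builds the coefficient list of $T_n$ from $T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x)$; it starts from $T_0 = 1$ and $T_1 = x$ (the case $n = 0$ is a convenience assumed here), and the function names are illustrative.

```python
import math


def chebyshev_coeffs(n: int) -> list[int]:
    """Coefficients of T_n, lowest degree first, via T_{k+1} = 2x T_k - T_{k-1}."""
    prev, curr = [1], [0, 1]  # T_0 = 1, T_1 = x
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = [0] + [2 * c for c in curr]  # coefficients of 2x * T_k
        for i, c in enumerate(prev):
            nxt[i] -= c                    # subtract T_{k-1}
        prev, curr = curr, nxt
    return curr


def evaluate(coeffs: list[int], x: float) -> float:
    """Evaluate a polynomial given by its coefficients using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


print(chebyshev_coeffs(2))  # T_2(x) = 2x^2 - 1
print(chebyshev_coeffs(3))  # T_3(x) = 4x^3 - 3x
t = 0.3
print(evaluate(chebyshev_coeffs(7), math.cos(t)), math.cos(7 * t))
```

The last line compares $T_7(\cos t)$ with $\cos 7t$ at one sample point, which is the identity asserted in the lemma.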


The second example of harmonic synthesis is obtained by damping the
contributions for large values of $|n|$. Suppose that $f \in V$ and that $0 \le r < 1$.
Then we set

\[ P_r(f)(t) = \sum_{n=-\infty}^{\infty} r^{|n|} \hat f_n e^{int}. \]

Since $|\hat f_n| \le \|f\|_1$, the series converges absolutely, and converges uniformly
in $t$. Thus $P_r(f)$ is a continuous function. Let us set

\[ P_{r,n}(t) = \sum_{j=-n}^{n} r^{|j|} e^{ijt}, \quad\text{and}\quad P_r(t) = \sum_{j=-\infty}^{\infty} r^{|j|} e^{ijt}. \]

Then $P_{r,n} \to P_r$ as $n \to \infty$, uniformly in $t$, and so

\[ P_r(f)(t) = \lim_{n\to\infty} \frac{1}{2\pi}\int_{-\pi}^{\pi} P_{r,n}(t-s) f(s)\,ds = \lim_{n\to\infty} (P_{r,n} * f)(t) = (P_r * f)(t). \]

The function $(r, t) \to P_r(t)$ is the Poisson kernel. Now

\[ P_r(t) = \sum_{j=0}^{\infty} r^j e^{ijt} + \sum_{j=0}^{\infty} r^j e^{-ijt} - 1
   = \frac{1}{1 - re^{it}} + \frac{1}{1 - re^{-it}} - 1 = \frac{1 - r^2}{1 - 2r\cos t + r^2}. \]

Note that $P_r(0) = (1+r)/(1-r)$, that $P_r(t) > 0$ and that $P_r$ is an even
function. Further,

\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} P_r(t)\,dt = \sum_{j=-\infty}^{\infty} r^{|j|}\Big(\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{ijt}\,dt\Big) = 1. \]

Figure 9.7b. The Poisson kernel.

Since

\[ 1 - 2r\cos t + r^2 = (1-r)^2 + 2r(1 - \cos t) \ge 2r(1 - \cos t) = 4r\sin^2\tfrac12 t, \]

$P_r(t) \to 0$ uniformly on $\{t : \delta \le |t| \le \pi\}$ as $r \nearrow 1$, for $0 < \delta < \pi$.

Thus the Poisson kernel is an approximate identity (though here the 
parameter r is in [0,1), and we are concerned with limits as r increases to 1). 
Thus we have the following. 


Theorem 9.7.8 If $f \in V$ is continuous at $t_0$ then $P_r(f)(t_0) \to f(t_0)$ as
$r \nearrow 1$. If $f$ is a continuous function, then $P_r(f)(t) \to f(t)$, uniformly in $t$,
as $r \nearrow 1$.
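
As an informal check of the formulae for the Poisson kernel used in this theorem, the sketch below compares a truncated version of the series $\sum_j r^{|j|}e^{ijt} = 1 + 2\sum_{j\ge1} r^j\cos jt$ with the closed form $(1-r^2)/(1-2r\cos t+r^2)$. The truncation length and the function names are assumptions of this example.

```python
import math


def poisson_series(r: float, t: float, terms: int = 200) -> float:
    """Truncated series 1 + 2 * sum_{j=1}^{terms} r^j cos(jt)."""
    return 1.0 + 2.0 * sum(r ** j * math.cos(j * t) for j in range(1, terms + 1))


def poisson_closed(r: float, t: float) -> float:
    """Closed form (1 - r^2) / (1 - 2 r cos t + r^2), for 0 <= r < 1."""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(t) + r * r)


print(poisson_series(0.5, 1.0), poisson_closed(0.5, 1.0))
print(poisson_closed(0.5, 0.0))  # equals (1 + r) / (1 - r) at t = 0
# Away from t = 0 the kernel becomes small as r increases to 1.
print([poisson_closed(r, 1.0) for r in (0.9, 0.99, 0.999)])
```

The last line illustrates the uniform decay on $\{t : \delta \le |t| \le \pi\}$ that makes the Poisson kernel an approximate identity.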


Which of these two methods is more powerful? In order to answer this, we 
need a stronger version of Abel’s theorem (Theorem 6.6.5). 


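Theorem 9.7.9 Suppose that $(a_n)_{n=0}^{\infty}$ is a real or complex sequence. Let
$s_n = \sum_{j=0}^{n} a_j$, and let $\sigma_n = (\sum_{j=0}^{n} s_j)/(n+1)$, for $n \in \mathbf{Z}^+$. Suppose that
$\sigma_n \to \sigma$ as $n \to \infty$. Then $\sum_{n=0}^{\infty} a_n x^n \to \sigma$ as $x \nearrow 1$.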


Proof The proof is very similar to the proof of Theorem 6.6.5. By replacing
$a_0$ by $a_0 - \sigma$, we can suppose that $\sigma = 0$. Let $s_n = \sum_{j=0}^{n} a_j$, for $n \in \mathbf{Z}^+$.
Suppose that $0 < x < 1$. Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$. Recall that $1/(1-x)^2 =
\sum_{n=0}^{\infty} (n+1)x^n$. Each of the series $\sum_{n=0}^{\infty} a_n x^n$ and $\sum_{n=0}^{\infty} (n+1)x^n$ converges
absolutely, and so by Proposition 4.6.1, the convolution product $\sum_{n=0}^{\infty} c_n x^n$
converges absolutely to $f(x)/(1-x)^2$. But

\[ c_n = \sum_{j=0}^{n} a_j(n+1-j) = (n+1)\sigma_n, \]

and so $f(x) = (1-x)^2 \sum_{n=0}^{\infty} (n+1)\sigma_n x^n$.
Suppose that $0 < \epsilon < 1$. There exists $n_0$ such that $|\sigma_n| < \epsilon/2$ for $n \ge n_0$,
and so

\[ \Big|(1-x)^2 \sum_{n=n_0}^{\infty} (n+1)\sigma_n x^n\Big| \le \epsilon(1-x)^2\Big(\sum_{n=n_0}^{\infty} (n+1)x^n\Big)/2
   \le \epsilon(1-x)^2\Big(\sum_{n=0}^{\infty} (n+1)x^n\Big)/2 = \epsilon/2. \]

On the other hand, the sequence $(\sigma_n)_{n=0}^{\infty}$ is bounded: let $M =
\sup\{|\sigma_n| : n \in \mathbf{Z}^+\}$. Then

\[ \Big|(1-x)^2 \sum_{n=0}^{n_0-1} (n+1)\sigma_n x^n\Big| \le (1-x)^2 M n_0(n_0+1)/2. \]

Let $\eta = (\epsilon/((M+1)n_0(n_0+1)))^{1/2}$. If $1 - \eta < x < 1$ then

\[ \Big|(1-x)^2 \sum_{n=0}^{n_0-1} (n+1)\sigma_n x^n\Big| < \epsilon/2, \]

and so

\[ \Big|\sum_{n=0}^{\infty} a_n x^n\Big| = |f(x)| = \Big|(1-x)^2 \sum_{n=0}^{\infty} (n+1)\sigma_n x^n\Big| < \epsilon. \]


It follows that the Poisson kernel is more powerful than the Fejér kernel. 
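
Theorem 9.7.9 can be illustrated with the sequence $a_n = (-1)^n$, whose partial sums $1, 0, 1, 0, \ldots$ do not converge. The sketch below (names and the truncation length are choices made for this example) computes the averages $\sigma_n$ and a truncated power series $\sum a_n x^n$ for $x$ close to 1; both approach $1/2$.

```python
from itertools import accumulate


def cesaro_means(terms: list[int]) -> list[float]:
    """sigma_n = (s_0 + ... + s_n) / (n + 1), where s_n are the partial sums."""
    partial_sums = list(accumulate(terms))
    running_totals = accumulate(partial_sums)
    return [total / (n + 1) for n, total in enumerate(running_totals)]


def abel_sum(terms: list[int], x: float) -> float:
    """Truncated power series sum_{n} a_n x^n over the given terms."""
    return sum(a * x ** n for n, a in enumerate(terms))


grandi = [(-1) ** n for n in range(20_000)]
sigma = cesaro_means(grandi)
print(sigma[999], sigma[-1])
for x in (0.9, 0.99, 0.999):
    # For this sequence the full series equals 1 / (1 + x).
    print(x, abel_sum(grandi, x), 1 / (1 + x))
```

Here the averaged partial sums and the power-series values both tend to $1/2$, although the series itself diverges.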


Exercise 


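9.7.1 Suppose that $(a_n)_{n=0}^{\infty}$ is a real or complex sequence. Let $s_n = \sum_{j=0}^{n} a_j$,
and let $\sigma_n = (\sum_{j=0}^{n} s_j)/(n+1)$, for $n \in \mathbf{Z}^+$. Suppose that $\sigma_n \to \sigma$ as
$n \to \infty$. Suppose that $K > 0$. Let $W_K = \{z : |1 - z| \le K(1 - |z|)\}$.
Show that $\sum_{n=0}^{\infty} a_n z^n \to \sigma$ as $z \to 1$ in $W_K$.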


10 


Some applications 


The theory that we have developed is meant to be used. In this chapter, we 
consider various applications of the results that we have established, and in 
particular will introduce some of the important special functions of analysis. 
Some details are omitted; you should provide them. We shall return later to 
some of the topics considered here in Volumes II and III. 


10.1 Infinite products 


Suppose that $(a_j)_{j=0}^{\infty}$ is a sequence of real numbers, and that $a_j \ne -1$ for all
$j \in \mathbf{Z}^+$. Let $p_n = \prod_{j=0}^{n} (1 + a_j)$. We say that the infinite product $\prod_{j=0}^{\infty} (1 + a_j)$
converges to $p$ if $p \ne 0$ and $p_n \to p$ as $n \to \infty$. If $p_n \to 0$ as $n \to \infty$
we say that the product diverges to 0. If the product converges, then $a_n =
(p_n - p_{n-1})/p_{n-1} \to 0$ as $n \to \infty$: this means that we can restrict attention
to products for which $a_j > -1$ for all $j \in \mathbf{Z}^+$, so that $1 + a_j > 0$ for all
$j \in \mathbf{Z}^+$, and $p_n > 0$ for all $n \in \mathbf{Z}^+$. The logarithmic function then enables us
to reduce the problem of convergence of the product to the convergence of a
sum. The function log is a continuous bijection of $(0, \infty)$ onto $(-\infty, \infty)$, with
continuous inverse exp. Hence $p_n \to p$ (with $p \ne 0$) as $n \to \infty$ if and only if

\[ \sum_{j=0}^{n} \log(1 + a_j) = \log p_n \to \log p \quad\text{as } n \to \infty. \]

The general principle of convergence takes the following form. 


Proposition 10.1.1 Suppose that $(a_j)_{j=0}^{\infty}$ is a sequence of real numbers,
and that $a_j > -1$ for all $j \in \mathbf{Z}^+$. Then the product $\prod_{j=0}^{\infty} (1 + a_j)$ converges
if and only if, given $\epsilon > 0$, there exists $n_0 \in \mathbf{Z}^+$ such that

\[ \Big|\prod_{j=n+1}^{m} (1 + a_j) - 1\Big| = \Big|\frac{p_m}{p_n} - 1\Big| < \epsilon \]

for $m > n \ge n_0$.


Proof Let us prove this directly. Suppose that $p_n \to p$, and that $\epsilon > 0$. Then
$p/p_n \to 1$, and so there exists $n_0$ such that $|p_m - p_n| < \epsilon p/2$ and $p/p_n < 2$,
for $m > n \ge n_0$. Then

\[ \Big|\prod_{j=n+1}^{m} (1 + a_j) - 1\Big| = \frac{|p_m - p_n|}{p_n} = \frac{|p_m - p_n|}{p}\cdot\frac{p}{p_n} < \epsilon \]

for $m > n \ge n_0$.
Conversely suppose that the condition is satisfied. Then there exists $n_0$
such that

\[ \Big|\frac{p_m}{p_n} - 1\Big| = \Big|\prod_{j=n+1}^{m} (1 + a_j) - 1\Big| < 1/2, \]

for $m > n \ge n_0$, and so $p_{n_0}/2 < p_n < 2p_{n_0}$ for $n \ge n_0$. Given $\epsilon > 0$,
there exists $n_1 \ge n_0$ such that $|(p_m/p_n) - 1| < \epsilon/2p_{n_0}$, for $m > n \ge n_1$. If
$m > n \ge n_1$, then

\[ |p_m - p_n| = \Big|\frac{p_m}{p_n} - 1\Big|\,p_n < \epsilon, \]

so that, by the general principle of convergence, $(p_n)_{n=0}^{\infty}$ converges, to $p$ say.
Further, since $p_n > p_{n_0}/2$ for $n \ge n_0$, $p \ge p_{n_0}/2 > 0$.


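If the infinite product $\prod_{j=0}^{\infty} (1 + a_j)$ converges, then $a_j \to 0$ as $j \to \infty$. If
$\sum_{j=0}^{\infty} a_j^2 < \infty$, we can say much more.


Theorem 10.1.2 Suppose that $(a_j)_{j=0}^{\infty}$ is a sequence of real numbers, that
$a_j > -1$ for all $j \in \mathbf{Z}^+$ and that $\sum_{j=0}^{\infty} a_j^2 < \infty$. Then the infinite product
$\prod_{j=0}^{\infty} (1 + a_j)$ converges if and only if $\sum_{j=0}^{\infty} a_j$ converges.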


Proof We use the fact that if $|h| < 1/2$ then, by Taylor's theorem,

\[ \log(1 + h) - h = -\frac{h^2}{2(1 + \theta h)^2}, \quad\text{for some } 0 < \theta < 1, \]

so that

\[ h - 2h^2 \le \log(1 + h) \le h. \]

There exists $n_0$ such that $|a_j| < 1/2$ for $j \ge n_0$. If $m > n \ge n_0$ then

\[ \sum_{j=n+1}^{m} a_j - 2\sum_{j=n+1}^{m} a_j^2 \le \sum_{j=n+1}^{m} \log(1 + a_j) \le \sum_{j=n+1}^{m} a_j, \]

and it follows easily from this that $(\sum_{j=0}^{n} \log(1 + a_j))_{n=0}^{\infty}$ is a Cauchy sequence
if and only if $(\sum_{j=0}^{n} a_j)_{n=0}^{\infty}$ is a Cauchy sequence. The result therefore follows
from the general principle of convergence.


Next let us consider the cases when the terms $a_j$ are all positive.


Proposition 10.1.3 Suppose that $(a_j)_{j=0}^{\infty}$ is a sequence of positive numbers,
and that $a_j < 1$ for all $j \in \mathbf{Z}^+$. The following statements are
equivalent.

(i) $\sum_{j=0}^{\infty} a_j$ converges.

(ii) $\prod_{j=0}^{\infty} (1 + a_j)$ converges.
(iii) $\prod_{j=0}^{\infty} (1 - a_j)$ converges.

(iv) If $|b_j| \le 1$ and $a_j b_j \ne -1$ for $j \in \mathbf{Z}^+$ then $\prod_{j=0}^{\infty} (1 + b_j a_j)$ converges.


Proof If (i) holds and $|b_j| \le 1$ for $j \in \mathbf{Z}^+$ then $\sum_{j=0}^{\infty} (a_j b_j)^2 < \infty$, and
so (iv) holds, by Theorem 10.1.2. Clearly, (iv) implies (iii). Suppose that
$\prod_{j=0}^{\infty} (1 - a_j)$ converges, to $q$, say. Then since $1 + a_j < 1/(1 - a_j)$,

\[ \prod_{j=0}^{n} (1 + a_j) \le \Big(\prod_{j=0}^{n} (1 - a_j)\Big)^{-1} \le 1/q, \]

and so the increasing sequence $(\prod_{j=0}^{n} (1 + a_j))_{n=0}^{\infty}$ converges: (ii) holds.

Suppose that (ii) holds. Let $p_n = \prod_{j=0}^{n} (1 + a_j)$ and let $p = \prod_{j=0}^{\infty} (1 + a_j)$.
Since $a_j \to 0$ as $j \to \infty$, there exists $n_0$ such that $a_j < 1$ for $j \ge n_0$. But, by
the mean-value theorem, $\log(1 + x) = x/(1 + \theta x)$ for some $0 < \theta < 1$ and so
$\log(1 + x) > x/2$, for $0 < x < 1$. Therefore

\[ \sum_{j=0}^{n} a_j \le \sum_{j=0}^{n_0} a_j + 2\sum_{j=n_0+1}^{n} \log(1 + a_j) = \sum_{j=0}^{n_0} a_j + 2\log(p_n/p_{n_0}) \le \sum_{j=0}^{n_0} a_j + 2\log(p/p_{n_0}) \]

for $n > n_0$, so that $\sum_{j=0}^{\infty} a_j$ converges. Thus (i) holds.


Corollary 10.1.4 If $(a_j)_{j=0}^{\infty}$ is a real sequence, none of whose terms
takes the value $-1$, and if $\sum_{j=0}^{\infty} |a_j| < \infty$, then the infinite product
$\prod_{j=0}^{\infty} (1 + a_j)$ converges. Further, if $\sigma$ is a permutation of $\mathbf{Z}^+$ then
$\prod_{j=0}^{\infty} (1 + a_{\sigma(j)})$ converges, to the same value.


Such a product is said to be absolutely convergent.

Exercises


10.1.1 Use the logarithmic function to deduce Proposition 10.1.1 from
Theorem 3.6.2.

10.1.2 Let $a_{2k} = 1/\sqrt{k+1}$ and let $a_{2k+1} = -1/\sqrt{k+1}$. Show that $\sum_{j=0}^{\infty} a_j$
converges, whereas $\prod_{j=0}^{\infty} (1 + a_j)$ diverges to 0.

10.1.3 Let $a_0 = 0$, let $a_{2k-1} = 1/\sqrt{k}$ and let $a_{2k} = -1/\sqrt{k} + 1/k$, for $k \in \mathbf{N}$.
Show that $\sum_{j=0}^{\infty} a_j$ diverges, whereas $\prod_{j=0}^{\infty} (1 + a_j)$ converges.

10.1.4 Why do these examples not contradict Theorem 10.1.2?

10.1.5 Let $p_1 < p_2 < \cdots$ be an enumeration of the primes. By considering
products of the form $\prod_{j=1}^{n} (1 - 1/p_j)^{-1}$, or otherwise, show that
$\prod_{j=1}^{\infty} (1 - 1/p_j)$ diverges to 0. Deduce that $\sum_{j=1}^{\infty} (1/p_j) = \infty$.


10.2 The Taylor series of logarithmic functions 


Integrating the identity

\[ \frac{1}{1+t} = 1 - t + \cdots + (-t)^{n-1} + \frac{(-t)^n}{1+t}, \]

we see that

\[ \log(1 + x) = x - \frac{x^2}{2} + \cdots - \frac{(-x)^n}{n} + \int_0^x \frac{(-t)^n}{1+t}\,dt \]

for $x > -1$. Suppose that $-1 < x \le 1$. Then the remainder term tends to 0
as $n \to \infty$, and so

\[ \log(1 + x) = x - \frac{x^2}{2} + \cdots - \frac{(-x)^n}{n} - \cdots = \sum_{n=1}^{\infty} \frac{(-1)^{n+1} x^n}{n}. \]

Since

\[ \log(1 - x) = -x - \frac{x^2}{2} - \cdots - \frac{x^n}{n} - \cdots, \]

it follows that

\[ \log\Big(\frac{1+x}{1-x}\Big) = 2\Big(x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots\Big), \]

for $-1 < x < 1$. We shall use this formula when we establish Stirling's
formula.

Exercise


10.2.1 Show that

\[ \frac{1}{n}\Big(1 + \frac{1}{n}\Big)^{n\log n} \to 1 \quad\text{as } n \to \infty. \]
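
The series for $\log((1+x)/(1-x))$ obtained in this section converges quickly for small $|x|$. The following sketch (the function name and the number of terms are illustrative) compares a truncated sum with the library logarithm, including the value $x = 1/(2n+1)$, for which $(1+x)/(1-x) = (n+1)/n$.

```python
import math


def log_ratio_series(x: float, terms: int = 60) -> float:
    """Truncated series 2 * (x + x^3/3 + x^5/5 + ...) for log((1+x)/(1-x))."""
    return 2.0 * sum(x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))


print(log_ratio_series(0.5), math.log(3.0))
# With x = 1/(2n+1) the ratio (1+x)/(1-x) equals (n+1)/n; here n = 3.
print(log_ratio_series(1 / 7), math.log(4 / 3))
```

Only odd powers of $x$ appear, so for $x = 1/(2n+1)$ a handful of terms already gives $\log((n+1)/n)$ to high accuracy.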


10.3 The beta function 
The beta function $B(x, y)$ is defined for $x > 0$ and $y > 0$ as

\[ B(x, y) = \int_0^1 t^{x-1}(1-t)^{y-1}\,dt. \]

Note that if $x < 1$ or $y < 1$ then this is an improper integral. Note also
that $B(x, 1) = 1/x$. The change of variables $s = 1 - t$ shows that $B(x, y) =
B(y, x)$. If we make the change of variables $t = \sin^2\theta$ then $1 - t = \cos^2\theta$ and
$dt/d\theta = 2\sin\theta\cos\theta$, so that

\[ B(x, y) = 2\int_0^{\pi/2} \sin^{2x-1}\theta \cos^{2y-1}\theta\,d\theta. \]

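Proposition 10.3.1 If $x > 0$ and $y > 0$ then

\[ B(x, y+1) = \frac{yB(x, y)}{x+y}. \]


Proof Integrating by parts,

\[ B(x, y+1) = \int_0^1 t^{x-1}(1-t)^y\,dt = \int_0^1 t^{x+y-1}\Big(\frac{1-t}{t}\Big)^y\,dt \]
\[ = \Big[\frac{t^{x+y}}{x+y}\Big(\frac{1-t}{t}\Big)^y\Big]_0^1 + \frac{y}{x+y}\int_0^1 t^{x+y}\Big(\frac{1-t}{t}\Big)^{y-1}\frac{1}{t^2}\,dt \]
\[ = \frac{y}{x+y}\int_0^1 t^{x-1}(1-t)^{y-1}\,dt = \frac{y}{x+y}B(x, y). \]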


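This means that we can calculate the value of $B(x, y)$ for all positive $x$ and
$y$ if we can calculate it for $0 < x \le 1$ and $0 < y \le 1$. Let

\[ I_s = \int_0^{\pi/2} \sin^s\theta\,d\theta \quad\text{for } s > -1. \]

Corollary 10.3.2 (i) $B(1/2, 1/2) = \pi$.
(ii) $B(x/2, 1/2) = 2I_{x-1}$.
(iii) $sI_s = (s-1)I_{s-2}$, for $s > 1$.
(iv) If $k \in \mathbf{N}$ then

\[ I_{2k} = \int_0^{\pi/2} \sin^{2k}\theta\,d\theta = \frac{(2k-1)(2k-3)\cdots1}{(2k)(2k-2)\cdots2}\cdot\frac{\pi}{2}
   = \frac{(2k)!}{2^{2k}(k!)^2}\cdot\frac{\pi}{2} = \frac{1}{2^{2k}}\binom{2k}{k}\frac{\pi}{2} \]

and

\[ I_{2k+1} = \int_0^{\pi/2} \sin^{2k+1}\theta\,d\theta = \frac{(2k)(2k-2)\cdots2}{(2k+1)(2k-1)\cdots3} = \frac{2^{2k}(k!)^2}{(2k+1)!}. \]


Proof These results all follow easily from the equation

\[ B(x, y) = 2\int_0^{\pi/2} \sin^{2x-1}\theta \cos^{2y-1}\theta\,d\theta. \]

Now

\[ 1 \ge \frac{I_{2k+1}}{I_{2k}} = \frac{2k}{2k+1}\cdot\frac{I_{2k-1}}{I_{2k}} \ge \frac{2k}{2k+1}, \]

so that $I_{2k+1}/I_{2k} \to 1$ as $k \to \infty$. Thus we obtain Wallis' formula

\[ \frac{2^{4k+1}(k!)^4}{(2k+1)!(2k)!} \to \pi \quad\text{as } k \to \infty. \]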


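Let us establish a corresponding result for an infinite product. Since

\[ \frac{2^k k!}{(2k)!} = \frac{2\cdot4\cdots2k}{1\cdot2\cdots2k} = \frac{1}{1\cdot3\cdots(2k-1)} \quad\text{and}\quad \frac{2^k k!}{(2k+1)!} = \frac{1}{3\cdot5\cdots(2k+1)}, \]

it follows that

\[ \frac{2^{4k}(k!)^4}{(2k)!(2k+1)!} = \frac{2\cdot2\cdot4\cdot4\cdots(2k)\cdot(2k)}{1\cdot3\cdot3\cdot5\cdots(2k-1)(2k+1)}
   = \prod_{j=1}^{k} \frac{(2j)^2}{(2j-1)(2j+1)} = \prod_{j=1}^{k} \frac{4j^2}{4j^2-1}. \]

Thus it follows from Wallis' formula that

\[ \prod_{j=1}^{\infty} \frac{4j^2}{4j^2-1} = \frac{\pi}{2}. \]

If $0 < x < 1$ then

\[ B(x, 1-x) = 2\int_0^{\pi/2} \tan^{2x-1}\theta\,d\theta. \]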


Our main concern now is to find an expression for this integral. Making the
change of variables $v = t/(1-t)$, we find that

\[ B(x, y) = \int_0^{\infty} \frac{v^{x-1}}{(1+v)^{x+y}}\,dv, \quad\text{so that}\quad B(x, 1-x) = \int_0^{\infty} \frac{v^{x-1}}{1+v}\,dv. \]

Now

\[ \int_0^1 \frac{v^{x-1}}{1+v}\,dv = \int_0^1 v^{x-1}\big(1 - v + \cdots + (-1)^n v^n\big)\,dv + (-1)^{n+1}\int_0^1 \frac{v^{x+n}}{1+v}\,dv. \]

The second term on the right-hand side tends to 0 as $n \to \infty$ (why?), and so

\[ \int_0^1 \frac{v^{x-1}}{1+v}\,dv = \frac{1}{x} + \sum_{n=1}^{\infty} (-1)^n \frac{1}{x+n}. \]

Similarly, making the change of variables $v = 1/w$, we find that

\[ \int_1^{\infty} \frac{v^{x-1}}{1+v}\,dv = \int_0^1 \frac{1}{w^x(1+w)}\,dw
   = \int_0^1 w^{-x}\big(1 - w + \cdots + (-1)^n w^n\big)\,dw + (-1)^{n+1}\int_0^1 \frac{w^{n+1-x}}{1+w}\,dw. \]

Again, the second term on the right-hand side tends to 0 as $n \to \infty$, and so

\[ \int_1^{\infty} \frac{v^{x-1}}{1+v}\,dv = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{1}{n-x}. \]


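Adding the two integrals, we find that

\[ B(x, 1-x) = \frac{1}{x} + \sum_{n=1}^{\infty} (-1)^n\Big(\frac{1}{x+n} + \frac{1}{x-n}\Big) = \frac{1}{x} + 2x\sum_{n=1}^{\infty} (-1)^n \frac{1}{x^2-n^2}. \]

We shall see in Volume III that this sum can be evaluated; its value is
$\pi/\sin\pi x$.

The beta function is logarithmically convex: if $x_0$, $x_1$, $y_0$ and $y_1$ are positive
and $0 < \theta < 1$ then, putting $x_\theta = (1-\theta)x_0 + \theta x_1$, $y_\theta = (1-\theta)y_0 + \theta y_1$ and
using Hölder's inequality with indices $1/(1-\theta)$ and $1/\theta$, we see that

\[ B(x_\theta, y_\theta) = \int_0^1 t^{x_\theta-1}(1-t)^{y_\theta-1}\,dt
   = \int_0^1 \big(t^{x_0-1}(1-t)^{y_0-1}\big)^{1-\theta}\big(t^{x_1-1}(1-t)^{y_1-1}\big)^{\theta}\,dt \]
\[ \le \Big(\int_0^1 t^{x_0-1}(1-t)^{y_0-1}\,dt\Big)^{1-\theta}\Big(\int_0^1 t^{x_1-1}(1-t)^{y_1-1}\,dt\Big)^{\theta}
   = B(x_0, y_0)^{1-\theta} B(x_1, y_1)^{\theta}. \]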
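
Returning to the series for $B(x, 1-x)$ obtained in this section, its stated value $\pi/\sin\pi x$ can be checked numerically without any of the theory of Volume III. The sketch below sums $1/x + 2x\sum_{n\ge1}(-1)^n/(x^2-n^2)$ up to a finite number of terms; the function name and the truncation length are choices made for this illustration.

```python
import math


def reflection_series(x: float, terms: int = 100_000) -> float:
    """Truncated value of 1/x + 2x * sum_{n>=1} (-1)^n / (x^2 - n^2)."""
    total = 1.0 / x
    for n in range(1, terms + 1):
        total += 2.0 * x * (-1) ** n / (x * x - n * n)
    return total


for x in (0.3, 0.5, 0.8):
    # The terms alternate in sign and decrease in size, so the error is
    # at most the first omitted term, roughly 2x / terms^2.
    print(x, reflection_series(x), math.pi / math.sin(math.pi * x))
```

At $x = 1/2$ the comparison reduces to the statement $B(1/2, 1/2) = \pi$ of Corollary 10.3.2 (i).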


Exercises


10.3.1 Show that $xB(x, y+1) = yB(x+1, y)$.
10.3.2 Show that $B(x, y) = B(x+1, y) + B(x, y+1)$.


10.3.3 Show that

\[ \prod_{j=1}^{n} \Big(1 - \frac{1}{4j^2}\Big) \to \frac{2}{\pi} \quad\text{as } n \to \infty. \]

10.3.4 Show that $n(\int_0^{\pi} \sin^n t\,dt)^2 \to 2\pi$ as $n \to \infty$. [Consider the cases $n$
odd and $n$ even separately.]


10.4 Stirling’s formula 
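We wish to estimate the size of $n!$, as $n$ becomes large. Let

\[ a_n = \frac{n!\,e^n}{n^{n+1/2}}, \quad\text{and let}\quad b_n = \log a_n = \log(n!) - (n + 1/2)\log n + n. \]

Set $s = 1/(2n+1)$ (so that $(n+1)/n = (1+s)/(1-s)$). Using the result
about logarithmic functions established in Section 10.2,

\[ b_n - b_{n+1} = -\log(n+1) - (n + 1/2)\log n + (n + 3/2)\log(n+1) - 1 \]
\[ = (n + 1/2)\log\Big(\frac{n+1}{n}\Big) - 1 = \frac{1}{2s}\log\Big(\frac{1+s}{1-s}\Big) - 1 = \frac{s^2}{3} + \frac{s^4}{5} + \frac{s^6}{7} + \cdots. \]

Thus

\[ b_n - b_{n+1} \le \frac13(s^2 + s^4 + \cdots) = \frac{s^2}{3(1-s^2)} = \frac{1}{3((2n+1)^2 - 1)} = \frac{1}{12n} - \frac{1}{12(n+1)}, \]

\[ \text{and}\quad b_n - b_{n+1} \ge \frac{s^2}{3} = \frac{1}{3(2n+1)^2} \ge \frac{1}{12(n+1)} - \frac{1}{12(n+2)}. \]

These inequalities show that the sequences

\[ (b_n)_{n=1}^{\infty} \quad\text{and}\quad \Big(b_n - \frac{1}{12(n+1)}\Big)_{n=1}^{\infty} \]

are both decreasing sequences, and that the sequence $(b_n - 1/12n)_{n=1}^{\infty}$
is an increasing sequence. Thus all three sequences tend to a common
limit $b$. Consequently $a_n \to a$ as $n \to \infty$, where $a = e^b$, so that
$n! \sim a n^{n+1/2} e^{-n}$.

It remains to determine $a$. We use Wallis' formula.

\[ \frac{2^{4n+1}(n!)^4}{(2n)!(2n+1)!} \sim \frac{a^4 2^{4n+1} n^{4n+2} e^{-4n}}{a^2 (2n)^{4n+1} e^{-4n} (2n+1)} = \frac{a^2 n}{2n+1} \to \frac{a^2}{2}. \]

But $2^{4n+1}(n!)^4/((2n)!(2n+1)!) \to \pi$ as $n \to \infty$, by Wallis' formula, and so
$a = \sqrt{2\pi}$. Thus we obtain Stirling's formula

\[ n! \sim \sqrt{2\pi}\,n^{n+\frac12} e^{-n}. \]

More precisely,

\[ e^{1/12(n+1)}\sqrt{2\pi n}\Big(\frac{n}{e}\Big)^n \le n! \le e^{1/12n}\sqrt{2\pi n}\Big(\frac{n}{e}\Big)^n. \]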
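
The two-sided estimate for $n!$ can be checked directly for small $n$. The sketch below works with logarithms to avoid overflow; the function name `stirling_log_bounds` and the range of $n$ are illustrative choices.

```python
import math


def stirling_log_bounds(n: int) -> tuple[float, float]:
    """Logarithms of the lower and upper bounds for n! given above.

    Both bounds equal sqrt(2 pi n) (n/e)^n times exp(1/(12(n+1))) or exp(1/(12n)).
    """
    base = 0.5 * math.log(2 * math.pi * n) + n * math.log(n) - n
    return base + 1 / (12 * (n + 1)), base + 1 / (12 * n)


for n in (1, 2, 5, 10, 50):
    lower, upper = stirling_log_bounds(n)
    exact = math.log(math.factorial(n))
    print(n, lower, exact, upper)
```

The exact value sits much closer to the upper bound, which reflects the fact that the next correction term is of order $1/n^3$.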
é e€ 


10.5 The gamma function 


Suppose that $a > 0$. The exponential function $e^{ax}$ grows faster than any
polynomial, as $x \to +\infty$. On the other hand, $n!$ grows faster than $e^{an}$, as
$n \to \infty$, for any $a \in \mathbf{R}$. Can we find a continuous function $f$ of a natural kind
on $[0, \infty)$ such that $f(n) = n!$ for $n \in \mathbf{N}$? Perhaps surprisingly, the answer
is 'yes'; in fact, the function that we shall construct, the gamma function $\Gamma$,
satisfies $\Gamma(n+1) = n!$.

We want to define the improper integral $\int_0^{\infty} t^{x-1} e^{-t}\,dt$. We consider the
intervals $[0, 1]$ and $[1, \infty)$ separately.

First, suppose that $0 < x < 1$. Then $t^{x-1} e^{-t} \to \infty$ as $t \searrow 0$. But $t^{x-1} e^{-t} \le
t^{x-1}$ and $\int_{\epsilon}^{1} t^{x-1}\,dt = (1 - \epsilon^x)/x \to 1/x$ as $\epsilon \searrow 0$. Thus $I(\epsilon) = \int_{\epsilon}^{1} t^{x-1} e^{-t}\,dt$
is a decreasing function on $(0, 1]$ which is bounded above by $1/x$, and so
the improper integral $\int_0^1 t^{x-1} e^{-t}\,dt$ exists. If $x \ge 1$, the function $t^{x-1} e^{-t}$ is
a continuous function on $[0, 1]$, and so the Riemann integral $\int_0^1 t^{x-1} e^{-t}\,dt$
exists.

The function $t^{x-1} e^{-t}$ is continuous on $[1, \infty)$. If $n \in \mathbf{N}$ and $n > x$ then
$e^t \ge t^n/n!$, so that $t^{x-1} e^{-t} \le n!\,t^{x-n-1}$. Thus

\[ \int_1^X t^{x-1} e^{-t}\,dt \le n! \int_1^X t^{x-n-1}\,dt = \frac{n!(1 - X^{x-n})}{n-x} \le \frac{n!}{n-x}. \]

Consequently, the improper integral $\int_1^{\infty} t^{x-1} e^{-t}\,dt$ exists.
We can therefore define the gamma function for $0 < x < \infty$ as

\[ \Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\,dt: \]

it exists as an improper integral.
Note that $\Gamma(1) = \int_0^{\infty} e^{-t}\,dt = 1$.

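Proposition 10.5.1 If $x > 0$ then $x\Gamma(x) = \Gamma(x+1)$.


Proof Integrating by parts,

\[ \int_{\epsilon}^{X} t^{x-1} e^{-t}\,dt = \Big[\frac{t^x e^{-t}}{x}\Big]_{\epsilon}^{X} + \frac{1}{x}\int_{\epsilon}^{X} t^x e^{-t}\,dt. \]

Now $t^x e^{-t} \to 0$ as $t \searrow 0$ and as $t \to \infty$, and so it follows that $x\Gamma(x) =
\Gamma(x+1)$.


Corollary 10.5.2 If $n \in \mathbf{Z}^+$ then $\Gamma(n+1) = n!$.


Proof The result holds for $n = 1$, since $\Gamma(2) = \Gamma(1) = 1$. The result then
follows by induction.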


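Proposition 10.5.3 $\Gamma$ is a continuous function on $(0, \infty)$.


Proof Suppose that $x \in (0, \infty)$ and that $0 < a < x < b < \infty$. Suppose that
$\epsilon > 0$. There exist $0 < \eta < 1 < R < \infty$ such that

\[ \int_0^{\eta} t^{a-1} e^{-t}\,dt < \epsilon/5 \quad\text{and}\quad \int_R^{\infty} t^{b-1} e^{-t}\,dt < \epsilon/5. \]

There then exists $0 < \delta < \min(x-a, b-x)$ such that if $|x - y| < \delta$ then
$|t^{y-1} - t^{x-1}| < \epsilon/5(R - \eta)$ for $\eta \le t \le R$. If $|x - y| < \delta$ then

\[ \Gamma(y) - \Gamma(x) = I_0 + (I_1 - I_2) + (I_3 - I_4), \]

where

\[ I_0 = \int_{\eta}^{R} (t^{y-1} - t^{x-1}) e^{-t}\,dt, \]

\[ I_1 = \int_0^{\eta} t^{y-1} e^{-t}\,dt, \quad I_2 = \int_0^{\eta} t^{x-1} e^{-t}\,dt, \]

\[ I_3 = \int_R^{\infty} t^{y-1} e^{-t}\,dt, \quad I_4 = \int_R^{\infty} t^{x-1} e^{-t}\,dt. \]

The modulus of each integral is less than $\epsilon/5$, so that $|\Gamma(y) - \Gamma(x)| < \epsilon$.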


The beta and gamma functions are closely related. 


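Proposition 10.5.4 If $x > 0$ and $y > 0$ then
$\Gamma(x)\Gamma(y) = B(x, y)\Gamma(x+y)$.


Proof Changing variables by setting $t = su$, exchanging the order of
integration (this is justified in Volume II), and setting $w = s(1+u)$, we
find that

\[ \Gamma(x)\Gamma(y) = \Big(\int_0^{\infty} s^{x-1} e^{-s}\,ds\Big)\Big(\int_0^{\infty} t^{y-1} e^{-t}\,dt\Big) \]
\[ = \int_0^{\infty} s^{x-1} e^{-s}\Big(\int_0^{\infty} s^y u^{y-1} e^{-su}\,du\Big)\,ds \]
\[ = \int_0^{\infty} u^{y-1}\Big(\int_0^{\infty} s^{x+y-1} e^{-s(1+u)}\,ds\Big)\,du \]
\[ = \int_0^{\infty} u^{y-1}\Big(\int_0^{\infty} \frac{w^{x+y-1} e^{-w}}{(1+u)^{x+y}}\,dw\Big)\,du = \Gamma(x+y)\int_0^{\infty} \frac{u^{y-1}}{(1+u)^{x+y}}\,du. \]

Setting $v = u/(1+u)$, we find that

\[ \int_0^{\infty} \frac{u^{y-1}}{(1+u)^{x+y}}\,du = \int_0^1 v^{y-1}(1-v)^{x-1}\,dv = B(x, y). \]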
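
Proposition 10.5.4 can be tested numerically. The sketch below approximates $B(x, y)$ by a midpoint rule applied to $\int_0^1 t^{x-1}(1-t)^{y-1}\,dt$ and compares it with $\Gamma(x)\Gamma(y)/\Gamma(x+y)$, where the gamma values come from Python's `math.gamma`. The restriction to $x, y \ge 1$ (so that the integrand is bounded), the step count and the function name are assumptions of this example.

```python
import math


def beta_midpoint(x: float, y: float, steps: int = 100_000) -> float:
    """Midpoint-rule estimate of the beta integral, intended for x, y >= 1."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += t ** (x - 1) * (1 - t) ** (y - 1)
    return total * h


x, y = 2.5, 3.0
from_integral = beta_midpoint(x, y)
from_gamma = math.gamma(x) * math.gamma(y) / math.gamma(x + y)
print(from_integral, from_gamma)
# The recurrence of Proposition 10.3.1: B(x, y+1) = y B(x, y) / (x + y).
print(beta_midpoint(x, y + 1), y * from_integral / (x + y))
```

For $x < 1$ or $y < 1$ the integral is improper and a plain midpoint rule converges slowly, which is why this check stays away from that range.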


Exercises


10.5.1 Show that

\[ \Gamma(x) = \int_1^{\infty} \frac{(\log y)^{x-1}}{y^2}\,dy = \int_0^1 \Big(\log\frac{1}{u}\Big)^{x-1}\,du. \]

10.5.2 Show that the gamma function is a logarithmically convex function
on $(0, \infty)$.
10.5.3 Show that if $0 < x < 1$ then

\[ \frac{1}{ex} < \Gamma(x) < 1 + \frac{1}{x}. \]


10.6 Riemann’s zeta function 


It follows from the integral test that )77°,(1/j*) diverges if s < 1, and 
converges if s > 1. If s > 1, we set ¢(s) = )°72,(1/3°). The function ¢ is 
called Riemann’s zeta function; it is a decreasing function of s. 

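It follows from the integral test that

\[ \frac{1}{s-1} = \int_1^\infty \frac{dx}{x^s} < \zeta(s) = 1 + \sum_{j=2}^\infty (1/j^s) < 1 + \int_1^\infty \frac{dx}{x^s} = 1 + \frac{1}{s-1}, \]

so that $\zeta(s) \to \infty$ as $s \searrow 1$ and $\zeta(s) \to 1$ as $s \to \infty$.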
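The same integral comparison, applied to the tail of the series, gives a computable bracket for $\zeta(s)$. The following Python sketch is an illustration under that assumption: it sums the first $N$ terms and bounds the remainder between $\int_{N+1}^\infty x^{-s}\,dx$ and $\int_N^\infty x^{-s}\,dx$. The function name and the default number of terms are hypothetical.

```python
import math


def zeta_bracket(s, terms=10000):
    """Return (lower, upper) with lower < zeta(s) < upper, for s > 1.

    The partial sum over j <= terms is corrected by the integral-test
    bounds on the tail: int_{N+1}^inf x^-s dx < tail < int_N^inf x^-s dx.
    """
    if s <= 1:
        raise ValueError("the series diverges for s <= 1")
    partial = sum(j ** -s for j in range(1, terms + 1))
    lower = partial + (terms + 1) ** (1 - s) / (s - 1)
    upper = partial + terms ** (1 - s) / (s - 1)
    return lower, upper


for s in (1.1, 1.5, 2.0, 3.0, 10.0):
    lo, hi = zeta_bracket(s)
    # Compare with the cruder bounds 1/(s-1) < zeta(s) < 1 + 1/(s-1).
    print(s, 1 / (s - 1), lo, hi, 1 + 1 / (s - 1))
```

The printed values illustrate both limits: the bracket grows without bound as $s$ decreases to 1, and closes in on 1 as $s$ increases.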
It was Euler who first considered the sum as a function of the real variable $s$. Later, Riemann considered $\zeta$ as a function of a complex variable; he
introduced the notation $\zeta$ for the function and $s$ for the variable.

Euler recognized the importance of the zeta function for number theory,
and initiated the study of analytic number theory. Let $2 = p_1 < p_2 < \cdots$ be
the sequence of primes, in increasing order. We shall use Theorem 2.6.6, the
fundamental theorem of arithmetic: every $n \ge 2$ can be written uniquely in
the form $n = p_1^{a_1} \cdots p_k^{a_k}$, where $a_1, \dots, a_k \in \mathbf{Z}^+$ and $a_k \ne 0$.

We set

\[ P_n(s) = \prod_{j=1}^n \left(\frac{1}{1 - 1/p_j^s}\right) = \prod_{j=1}^n \left(1 + \frac{1}{p_j^s - 1}\right). \]

Thus $P_n(s)$ is a continuous decreasing function on $(0,\infty)$. Since

\[ \frac{1}{1 - 1/p_j^s} = 1 + \frac{1}{p_j^s} + \frac{1}{p_j^{2s}} + \cdots, \]

and since all the terms are non-negative, we can expand the products, to obtain

\[ P_n(s) = \sum\left\{ \frac{1}{(p_1^{k_1} \cdots p_n^{k_n})^s} : k_1, \dots, k_n \in \mathbf{Z}^+ \right\}. \]

This provides an analytic proof of the fact that there are infinitely many
primes. For if there are only finitely many primes $p_1, \dots, p_n$ then every positive integer can be written in the form $p_1^{k_1} \cdots p_n^{k_n}$, so that $P_n(1) = \sum_{j=1}^\infty 1/j = \infty$, giving a contradiction. We can say more.
If $s > 1$ then it follows that $P_n(s) \le \sum_{j=1}^\infty 1/j^s = \zeta(s)$. The sequence
$(P_n(s))_{n=1}^\infty$ is increasing, and so $P_n(s)$ converges to a limit $P(s)$, with $P(s) \le \zeta(s)$. On the other hand, suppose that $N \in \mathbf{N}$, and let $\{p_1, \dots, p_k\}$ be the
set of primes less than $N$. Then every $j < N$ can be written as a product of
powers of $p_1, \dots, p_k$, and so

\[ P(s) \ge P_k(s) \ge \sum_{j < N} \frac{1}{j^s}. \]

Since this holds for all $N$, $P(s) \ge \zeta(s)$, and so we have the following.


Proposition 10.6.1 If $1 < s < \infty$ then

\[ \zeta(s) = \prod_{j=1}^\infty \left(\frac{1}{1 - 1/p_j^s}\right), \]

where $(p_1 < p_2 < \cdots)$ is the sequence of primes, arranged in increasing
order.
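The two inequalities used in the proof can be watched numerically. The sketch below is illustrative: for a bound $N$ it computes the product over the primes less than $N$ and the partial sum over $j < N$, which should satisfy $\sum_{j<N} 1/j^s \le \prod_{p<N} (1 - p^{-s})^{-1} \le \zeta(s)$. The sieve and the function names are assumptions of this example, not part of the text.

```python
import math


def primes_below(n):
    """Primes p with p < n, by the sieve of Eratosthenes."""
    if n <= 2:
        return []
    is_prime = [True] * n
    is_prime[0] = is_prime[1] = False
    for i in range(2, math.isqrt(n - 1) + 1):
        if is_prime[i]:
            for multiple in range(i * i, n, i):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def euler_product(s, n_bound):
    """Product of 1 / (1 - p^-s) over the primes p < n_bound."""
    product = 1.0
    for p in primes_below(n_bound):
        product *= 1.0 / (1.0 - p ** -s)
    return product


def partial_zeta(s, n_bound):
    """Sum of j^-s over 1 <= j < n_bound."""
    return sum(j ** -s for j in range(1, n_bound))


for n_bound in (10, 100, 1000, 10000):
    print(n_bound, partial_zeta(2.0, n_bound), euler_product(2.0, n_bound))
print("zeta(2) =", math.pi ** 2 / 6)
```

The comparison value $\pi^2/6$ for $\zeta(2)$ is the one established in Section 10.8.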


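Corollary 10.6.2 $\sum_{n=1}^\infty (1/p_n) = \infty$.

Proof If not, then, by Proposition 10.1.3,

\[ \prod_{j=1}^\infty \left(1 - \frac{1}{p_j}\right) \]

is convergent to a non-zero limit $P$, say. But

\[ \frac{1}{\zeta(s)} = \prod_{j=1}^\infty \left(1 - \frac{1}{p_j^s}\right) \ge P \]

for $s > 1$, so that $1/P \ge \zeta(s)$. Since $\zeta(s) \to \infty$ as $s \searrow 1$, this gives a
contradiction.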


10.7 Chebyshev’s prime number theorem 


Again, let $2 = p_1 < p_2 < \cdots$ denote the sequence of primes, in increasing order. The fact that $\sum_{j=1}^\infty 1/p_j = \infty$ shows not only that there are
infinitely many primes, but also that they occur fairly frequently; for example, if $(a_j)_{j=1}^\infty$ is a sequence of positive numbers for which $\sum_{j=1}^\infty a_j < \infty$ then
$\liminf_{j\to\infty} a_j p_j = 0$. This raises the question: how are the prime numbers
distributed? If $x > 0$, let $\pi(x)$ be the number of primes not greater than $x$.
In 1792, Gauss, at the age of fifteen, conjectured that $\pi(x) \sim x/\log x$: that
is, $\pi(x)\log x/x \to 1$ as $x \to \infty$. In 1850, Chebyshev showed, by elementary
real analysis, that this was the right rate of growth.

First, let us introduce some notation. Suppose that $f$ is a real-valued
function defined on $\mathbf{N}$, and that $x > 0$. We set

\[ \sum_{p \le x} f(p) = \sum\{f(p) : p \text{ a prime}, p \le x\}, \]
\[ \sum_{y < p \le x} f(p) = \sum\{f(p) : p \text{ a prime}, y < p \le x\} \]
\[ \text{and} \quad \sum_{p^m \le x} f(p^m) = \sum\{f(p^m) : p \text{ a prime}, m \in \mathbf{N}, p^m \le x\}, \]

and use similar notations for products.

Chebyshev introduced two auxiliary functions:

\[ \theta(x) = \sum_{p \le x} \log p \quad \text{and} \quad \psi(x) = \sum_{p^m \le x} \log p. \]

He proved the following.

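Theorem 10.7.1 (Chebyshev's prime number theorem)

(i)
\[ \liminf_{x\to\infty} \frac{\theta(x)}{x} = \liminf_{x\to\infty} \frac{\psi(x)}{x} = \liminf_{x\to\infty} \frac{\pi(x)\log x}{x}, \]
\[ \limsup_{x\to\infty} \frac{\theta(x)}{x} = \limsup_{x\to\infty} \frac{\psi(x)}{x} = \limsup_{x\to\infty} \frac{\pi(x)\log x}{x}. \]

(ii) $\limsup_{x\to\infty} \dfrac{\theta(x)}{x} \le \log 4 < 1.387$.

(iii) $\liminf_{x\to\infty} \dfrac{\psi(x)}{x} \ge \frac{1}{2}\log 2 > 0.346$.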
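To get a feel for the three quantities in the theorem, one can tabulate $\theta(x)/x$, $\psi(x)/x$ and $\pi(x)\log x/x$ for moderate integer $x$. The Python sketch below is illustrative only; the sieve, the helper names and the sample values of $x$ are assumptions of this example. The exponent $\sup\{m : p^m \le x\}$ is found by integer multiplication, to avoid rounding problems with quotients of logarithms.

```python
import math


def primes_up_to(n):
    """Primes p with p <= n, by the sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            for multiple in range(i * i, n + 1, i):
                sieve[multiple] = False
    return [i for i, flag in enumerate(sieve) if flag]


def largest_power(p, x):
    """The largest m >= 0 with p^m <= x, using integer arithmetic only."""
    m, q = 0, p
    while q <= x:
        m += 1
        q *= p
    return m


def chebyshev_ratios(x):
    """Return (theta(x)/x, psi(x)/x, pi(x) log x / x) for an integer x >= 2."""
    primes = primes_up_to(x)
    theta = sum(math.log(p) for p in primes)
    psi = sum(largest_power(p, x) * math.log(p) for p in primes)
    return theta / x, psi / x, len(primes) * math.log(x) / x


for x in (100, 1000, 10000, 100000):
    print(x, chebyshev_ratios(x))
```

Such a table illustrates the statements but proves nothing about the limits; the convergence of $\pi(x)\log x/x$ in particular is slow.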


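Proof of (i) Clearly $\theta(x) \le \psi(x)$. Let $c_p(x) = \sup\{m : p^m \le x\}$. Then

\[ \psi(x) = \sum_{p \le x} c_p(x)\log p = \sum_{p \le \sqrt{x}} c_p(x)\log p + \sum_{\sqrt{x} < p \le x} c_p(x)\log p \le \sqrt{x}\log x + \theta(x), \]

since $c_p(x) = 1$ for $\sqrt{x} < p \le x$. Thus $\psi(x)/x - \theta(x)/x \to 0$ as $x \to \infty$.

Consequently

\[ \liminf_{x\to\infty} \frac{\theta(x)}{x} = \liminf_{x\to\infty} \frac{\psi(x)}{x} \quad \text{and} \quad \limsup_{x\to\infty} \frac{\theta(x)}{x} = \limsup_{x\to\infty} \frac{\psi(x)}{x}. \]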


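Since $c_p(x)$ is the largest integer $m$ such that $p^m \le x$,

\[ c_p(x)\log p \le \log x < (c_p(x) + 1)\log p, \]

so that $c_p(x)$ is the integral part of $\log x/\log p$. Consequently,

\[ \psi(x) = \sum_{p \le x} \left\lfloor \frac{\log x}{\log p} \right\rfloor \log p \le \pi(x)\log x. \]

Thus

\[ \liminf_{x\to\infty} \frac{\psi(x)}{x} \le \liminf_{x\to\infty} \frac{\pi(x)\log x}{x}, \qquad \limsup_{x\to\infty} \frac{\psi(x)}{x} \le \limsup_{x\to\infty} \frac{\pi(x)\log x}{x}. \]

Suppose that $0 < \alpha < 1$. Then

\[ \theta(x) \ge \sum_{x^\alpha < p \le x} \log p \ge \sum_{x^\alpha < p \le x} \log x^\alpha = (\alpha\log x)(\pi(x) - \pi(x^\alpha)) \ge (\alpha\log x)(\pi(x) - x^\alpha), \]

so that

\[ \frac{\theta(x)}{x} \ge \alpha\left(\frac{\pi(x)\log x}{x} - x^{\alpha-1}\log x\right). \]

Since $x^{\alpha-1}\log x \to 0$ as $x \to \infty$,

\[ \liminf_{x\to\infty} \frac{\theta(x)}{x} \ge \alpha\liminf_{x\to\infty} \frac{\pi(x)\log x}{x}, \qquad \limsup_{x\to\infty} \frac{\theta(x)}{x} \ge \alpha\limsup_{x\to\infty} \frac{\pi(x)\log x}{x}. \]

Since this holds for all $0 < \alpha < 1$,

\[ \liminf_{x\to\infty} \frac{\theta(x)}{x} \ge \liminf_{x\to\infty} \frac{\pi(x)\log x}{x}, \qquad \limsup_{x\to\infty} \frac{\theta(x)}{x} \ge \limsup_{x\to\infty} \frac{\pi(x)\log x}{x}. \]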


Proof of (ii) It is sufficient to show that $\theta(n)/n < \log 4$, for $n \in \mathbf{N}$,
with $n \ge 2$. We prove this by induction. Certainly $\theta(2) = \log 2 < 2\log 4$,
$\theta(3) = \log 6 < 3\log 4$ and $\theta(4) = \log 6 < 4\log 4$. Suppose that the result
holds for $2 \le j \le 2n$. Then, since $2n + 2$ is not prime,

\[ \theta(2n+2) - \theta(n+1) = \theta(2n+1) - \theta(n+1) = \sum_{n+1 < p \le 2n+1} \log p = \log P, \]

where $P = \prod_{n+1 < p \le 2n+1} p$. Now if $p$ is a prime and $n+1 < p \le 2n+1$ then $p$
divides $(2n+1)!$ and does not divide $(n+1)!$, and so $p$ divides $\binom{2n+1}{n+1} = \frac{(2n+1)!}{(n+1)!\,n!}$.
Thus $P$ divides $\binom{2n+1}{n+1}$. But

\[ \binom{2n+1}{n+1} = \frac{1}{2}\left(\binom{2n+1}{n} + \binom{2n+1}{n+1}\right) < \frac{1}{2}\sum_{j=0}^{2n+1} \binom{2n+1}{j} = 2^{2n} = 4^n, \]

so that $P < 4^n$, $\log P < n\log 4$, and

\[ \theta(2n+2) = \theta(2n+1) < \theta(n+1) + n\log 4 < (2n+1)\log 4. \]
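The key step of the induction, that the product $P$ of the primes in $(n+1, 2n+1]$ divides $\binom{2n+1}{n+1}$, which is less than $4^n$, is easy to test with exact integer arithmetic. The sketch below is a check for small $n$ only; the helper names and the trial-division primality test are assumptions of this example.

```python
import math


def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    return all(m % d for d in range(2, math.isqrt(m) + 1))


def prime_block_product(n):
    """P = product of the primes p with n + 1 < p <= 2n + 1."""
    product = 1
    for p in range(n + 2, 2 * n + 2):
        if is_prime(p):
            product *= p
    return product


def check_induction_step(n):
    """Check that P divides C(2n+1, n+1) and that C(2n+1, n+1) < 4^n."""
    binom = math.comb(2 * n + 1, n + 1)
    return binom % prime_block_product(n) == 0 and binom < 4 ** n


print(all(check_induction_step(n) for n in range(1, 200)))
```

Python integers have arbitrary precision, so the divisibility test is exact.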


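Proof of (iii) It is sufficient to show that $\psi(2n)/2n \ge \frac{1}{2}\log 2$, for $n \in \mathbf{N}$. By
the fundamental theorem of arithmetic, if $n \in \mathbf{N}$ we can write $n$ uniquely as
$\prod_{p \le n} p^{v_p(n)}$. The quantity $v_p(n)$ is the $p$-adic valuation of $n$. Since $\{1, \dots, n\}$
contains $\lfloor n/p \rfloor$ multiples of $p$, $\lfloor n/p^2 \rfloor$ multiples of $p^2$, and so on,

\[ v_p(n!) = \sum_{j=1}^\infty \lfloor n/p^j \rfloor. \]

(Of course, this is a finite sum.) Now

\[ \binom{2n}{n} = \frac{(2n)!}{(n!)^2} = \prod_{p \le 2n} p^{q_p(n)}, \]

where

\[ q_p(n) = v_p((2n)!) - 2v_p(n!) = \sum_{j=1}^\infty \lfloor 2n/p^j \rfloor - 2\sum_{j=1}^\infty \lfloor n/p^j \rfloor = \sum_{j=1}^\infty \left(\lfloor 2n/p^j \rfloor - 2\lfloor n/p^j \rfloor\right). \]

Now if $\lfloor 2n/p^j \rfloor$ is even, there are as many numbers of the form $ap^j$ in
$\{1, \dots, n\}$ as there are in $\{n+1, \dots, 2n\}$, and if $\lfloor 2n/p^j \rfloor$ is odd, there is one
fewer number of the form $ap^j$ in $\{1, \dots, n\}$ than there are in $\{n+1, \dots, 2n\}$ (why
is this?). Thus

\[ \lfloor 2n/p^j \rfloor - 2\lfloor n/p^j \rfloor = \begin{cases} 0 & \text{if } \lfloor 2n/p^j \rfloor \text{ is even,} \\ 1 & \text{if } \lfloor 2n/p^j \rfloor \text{ is odd,} \end{cases} \]

so that $0 \le q_p(n) \le c_p(2n)$. Consequently

\[ \log\binom{2n}{n} = \sum_{p \le 2n} q_p(n)\log p \le \sum_{p \le 2n} c_p(2n)\log p = \psi(2n). \]

But

\[ \binom{2n}{n} = \frac{(n+1)(n+2)\cdots(2n)}{1 \cdot 2 \cdots n} \ge 2^n, \]

so that $\log\binom{2n}{n} \ge n\log 2$, and $\psi(2n)/2n \ge \frac{1}{2}\log 2$.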
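The formula for $v_p(n!)$ and the resulting factorization of $\binom{2n}{n}$ can be verified exactly for small $n$. In the sketch below (an illustration; the function names are hypothetical), `legendre` computes $\sum_j \lfloor n/p^j \rfloor$, `q_exponent` computes $v_p((2n)!) - 2v_p(n!)$, and `c_exponent` computes $\sup\{m : p^m \le x\}$.

```python
import math


def legendre(n, p):
    """v_p(n!) = sum over j >= 1 of floor(n / p^j); the sum is finite."""
    total, power = 0, p
    while power <= n:
        total += n // power
        power *= p
    return total


def q_exponent(n, p):
    """Exponent of p in C(2n, n), namely v_p((2n)!) - 2 v_p(n!)."""
    return legendre(2 * n, p) - 2 * legendre(n, p)


def c_exponent(x, p):
    """The largest m >= 0 with p^m <= x."""
    m, q = 0, p
    while q <= x:
        m += 1
        q *= p
    return m


def check_central_binomial(n):
    """Check 0 <= q_p(n) <= c_p(2n) for every prime p <= 2n, and that the
    product of the p^q_p(n) equals C(2n, n), which is at least 2^n."""
    primes = [p for p in range(2, 2 * n + 1)
              if all(p % d for d in range(2, math.isqrt(p) + 1))]
    product = 1
    for p in primes:
        q = q_exponent(n, p)
        if not 0 <= q <= c_exponent(2 * n, p):
            return False
        product *= p ** q
    return product == math.comb(2 * n, n) and product >= 2 ** n


print(legendre(10, 2), all(check_central_binomial(n) for n in range(1, 80)))
```

For instance $10! = 2^8 \cdot 3^4 \cdot 5^2 \cdot 7$, in agreement with `legendre(10, 2) == 8`.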
Chebyshev also showed that

\[ \liminf_{x\to\infty} \frac{\pi(x)\log x}{x} \le 1 \le \limsup_{x\to\infty} \frac{\pi(x)\log x}{x}, \]

but the real difficulty is to show that $\pi(x)\log x/x$ tends to a limit as $x \to \infty$.
Gauss' conjecture was eventually proved independently by Hadamard and
de la Vallée Poussin in 1896.

Theorem 10.7.2 (The prime number theorem) $\pi(x)\log x/x \to 1$ as
$x \to \infty$.


Their proofs are difficult, and use the theory of functions of a complex 
variable. 


10.8 Evaluating $\zeta(2)$

An outstanding problem at the beginning of the eighteenth century was the
evaluation of

\[ \zeta(2) = 1 + \frac{1}{4} + \frac{1}{9} + \cdots = \sum_{n=1}^\infty \frac{1}{n^2}. \]

One of Euler's early triumphs was to show that $\zeta(2) = \pi^2/6$. We gave a proof
using Fourier series in Example 9.3.10: here we present one given by Euler.
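We start by applying the binomial theorem. Suppose that $0 < x < 1$. Now

\[ \frac{\sin^{-1}x}{\sqrt{1-x^2}} = \frac{x}{\sqrt{1-x^2}} + \sum_{j=1}^\infty \left(\frac{(2j)!}{(2j+1)2^{2j}(j!)^2} \cdot \frac{x^{2j+1}}{\sqrt{1-x^2}}\right), \]

by Corollary 8.5.5. Integrating term by term (justify this carefully),

\[ \frac{(\sin^{-1}x)^2}{2} = \int_0^x \frac{t}{\sqrt{1-t^2}}\,dt + \sum_{j=1}^\infty \left(\frac{(2j)!}{(2j+1)2^{2j}(j!)^2} \int_0^x \frac{t^{2j+1}}{\sqrt{1-t^2}}\,dt\right), \]

for $0 < x < 1$. Since all the terms are non-negative increasing functions of $x$,
it follows from Theorem 6.3.10 that the formula also holds if we set $x = 1$:

\[ \frac{\pi^2}{8} = \int_0^1 \frac{t}{\sqrt{1-t^2}}\,dt + \sum_{j=1}^\infty \left(\frac{(2j)!}{(2j+1)2^{2j}(j!)^2} \int_0^1 \frac{t^{2j+1}}{\sqrt{1-t^2}}\,dt\right). \]

Setting $t = \sin\theta$, and applying Corollary 10.3.2,

\[ \int_0^1 \frac{t^{2j+1}}{\sqrt{1-t^2}}\,dt = \int_0^{\pi/2} \sin^{2j+1}\theta\,d\theta = \frac{2^{2j}(j!)^2}{(2j+1)!}. \]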


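Substituting, we see that

\[ \frac{\pi^2}{8} = 1 + \sum_{j=1}^\infty \frac{1}{(2j+1)^2} = \zeta(2) - \sum_{j=1}^\infty \frac{1}{(2j)^2} = \frac{3}{4}\zeta(2), \]

giving the result.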
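The cancellation behind the last step can be confirmed in exact rational arithmetic. The sketch below is an illustration with hypothetical names: it multiplies the series coefficient $(2j)!/((2j+1)2^{2j}(j!)^2)$ by the value $2^{2j}(j!)^2/(2j+1)!$ of the integral, checks that the product is $1/(2j+1)^2$, and then sums these terms numerically to compare with $\pi^2/8$.

```python
import math
from fractions import Fraction


def series_coefficient(j):
    """(2j)! / ((2j + 1) 2^(2j) (j!)^2), the coefficient multiplying the j-th integral."""
    return Fraction(math.factorial(2 * j),
                    (2 * j + 1) * 4 ** j * math.factorial(j) ** 2)


def odd_sine_integral(j):
    """2^(2j) (j!)^2 / (2j + 1)!, the value of int_0^(pi/2) sin^(2j+1)."""
    return Fraction(4 ** j * math.factorial(j) ** 2, math.factorial(2 * j + 1))


def term(j):
    """The j-th term after substitution; it should equal 1 / (2j + 1)^2."""
    return series_coefficient(j) * odd_sine_integral(j)


def odd_square_sum(terms):
    """Floating-point partial sum of 1 / (2j + 1)^2 for 0 <= j < terms."""
    return sum(1.0 / (2 * j + 1) ** 2 for j in range(terms))


print(all(term(j) == Fraction(1, (2 * j + 1) ** 2) for j in range(30)))
print(odd_square_sum(100000), math.pi ** 2 / 8)
print(4.0 / 3.0 * odd_square_sum(100000), math.pi ** 2 / 6)
```

The partial sums converge slowly, with error of order $1/(4N)$ after $N$ terms, so the numerical agreement is only to a few decimal places.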


10.9 The irrationality of $e^r$

We have seen in Section 4.2 that $e$ is irrational; we now show that $e^r$ is
irrational for all non-zero rational $r$. If $r = p/q$ and $e^r$ is rational, then
$e^p = (e^r)^q$ is rational, and so is $e^{-p} = 1/e^p$. It is therefore enough to show
that $e^{-k}$ is irrational, for each positive integer $k$. Suppose, if possible, that
$e^{-k} = p/q$, where $p$ and $q$ are integers.

We use the fact that if $f$ is a differentiable function on $[a,b]$ then

\[ \frac{d}{dx}\left(e^{-kx}f(x)\right) = -e^{-kx}\left(kf(x) - f'(x)\right), \]

which suggests the possibility of cancellation.

Suppose that $f$ is an $(m+1)$-times differentiable function on $[a,b]$; set

\[ g(x) = -e^{-kx}\left(k^m f(x) + k^{m-1}f'(x) + \cdots + f^{(m)}(x)\right). \]

Then $g'(x) = e^{-kx}(k^{m+1}f(x) - f^{(m+1)}(x))$, so that

\[ g(b) - g(a) = \int_a^b e^{-kx}\left(k^{m+1}f(x) - f^{(m+1)}(x)\right)dx. \]


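We apply this to the polynomial function

\[ \beta_n(x) = \frac{x^n(1-x)^n}{n!} = \frac{1}{n!}\sum_{j=0}^n (-1)^j\binom{n}{j}x^{n+j}, \]

on the interval $[0,1]$. We choose this function, because $\beta_n(0) = 0$ and
$\beta_n^{(h)}(0) = 0$ for $1 \le h < n$ and for $h > 2n$. If $h = n + j$, where $0 \le j \le n$,
then

\[ \beta_n^{(h)}(0) = (-1)^j\frac{h!}{n!}\binom{n}{j} = (-1)^j j!\binom{n+j}{j}\binom{n}{j}, \]

which is an integer. Since $\beta_n(x) = \beta_n(1-x)$, similar phenomena occur when
$x = 1$. We now take $m = 2n$, and set

\[ g_n(x) = -e^{-kx}\left(k^{2n}\beta_n(x) + k^{2n-1}\beta_n'(x) + \cdots + \beta_n^{(2n)}(x)\right). \]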


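Then $g_n(1) = e^{-k}r = pr/q$ and $g_n(0) = s$, where $r$ and $s$ are integers. Further,
since $\beta_n^{(2n+1)}(x) = 0$ for all $x$,

\[ g_n(1) - g_n(0) = \int_0^1 e^{-kx}\left(k^{2n+1}\beta_n(x) - \beta_n^{(2n+1)}(x)\right)dx = \int_0^1 e^{-kx}k^{2n+1}\beta_n(x)\,dx. \]

Thus

\[ q\left(\int_0^1 e^{-kx}k^{2n+1}\beta_n(x)\,dx\right) = pr - qs, \text{ an integer.} \]

But $0 < \beta_n(x) < 1/n!$ and $e^{-kx} < 1$ for $0 < x < 1$, and so

\[ 0 < q\left(\int_0^1 e^{-kx}k^{2n+1}\beta_n(x)\,dx\right) < q\left(\frac{k^{2n+1}}{n!}\right). \]

Since $k^{2n+1}/n! \to 0$ as $n \to \infty$, it follows that

\[ 0 < q\left(\int_0^1 e^{-kx}k^{2n+1}\beta_n(x)\,dx\right) < 1 \]

for large enough $n$, giving the required contradiction.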
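The integrality of the derivatives of $\beta_n$ at $0$ and at $1$, on which the argument rests, can be checked exactly for small $n$. The following sketch is illustrative: it stores $\beta_n$ as a list of rational coefficients, differentiates it repeatedly, and evaluates at the two endpoints. The representation and the function names are assumptions of this example.

```python
from fractions import Fraction
from math import comb, factorial


def beta_coeffs(n):
    """Coefficients c[i] of x^i in beta_n(x) = x^n (1 - x)^n / n!."""
    coeffs = [Fraction(0)] * (2 * n + 1)
    for j in range(n + 1):
        coeffs[n + j] = Fraction((-1) ** j * comb(n, j), factorial(n))
    return coeffs


def derivative(coeffs):
    """Coefficients of the derivative of the polynomial with the given coefficients."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Value of the polynomial at x, in exact arithmetic."""
    return sum((c * x ** i for i, c in enumerate(coeffs)), Fraction(0))


def derivative_values(n):
    """The pairs (beta_n^(h)(0), beta_n^(h)(1)) for h = 0, 1, ..., 2n + 1."""
    values, poly = [], beta_coeffs(n)
    for _ in range(2 * n + 2):
        values.append((evaluate(poly, 0), evaluate(poly, 1)))
        poly = derivative(poly)
    return values


for n in range(1, 5):
    print(n, [(int(a), int(b)) for a, b in derivative_values(n)])
```

For $n = 2$, for example, $\beta_2(x) = (x^2 - 2x^3 + x^4)/2$, whose derivatives of orders 2, 3 and 4 at $0$ are $1$, $-6$ and $12$.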
10.10 The irrationality of $\pi$

We use the idea of the previous example to show that $\pi$ is irrational. In fact,
we do rather more, and show that $\pi^2$ is irrational. Suppose that $\pi^2 = p/q$,
where $p$ and $q$ are integers.
Suppose that $u$ is a twice differentiable function on an interval $[a,b]$. Let

\[ g(x) = u(x)\cos\pi x, \qquad h(x) = u'(x)\sin\pi x. \]

Then

\[ g'(x) = u'(x)\cos\pi x - \pi u(x)\sin\pi x, \]
\[ h'(x) = \pi u'(x)\cos\pi x + u''(x)\sin\pi x, \]

so that

\[ (h - \pi g)'(x) = (\pi^2 u(x) + u''(x))\sin\pi x. \]

This again suggests the possibility of cancellation.

Suppose that $f$ is a $(2n+2)$-times differentiable function; set

\[ u(x) = \pi^{2n}f(x) - \pi^{2n-2}f''(x) + \cdots + (-1)^n f^{(2n)}(x). \]

Then

\[ (h/\pi - g)'(x) = \left(\pi^{2n+1}f(x) + (-1)^n\pi^{-1}f^{(2n+2)}(x)\right)\sin\pi x. \]

Since $h(1) = h(0) = 0$, it follows that

\[ g(0) - g(1) = \int_0^1 \left(\pi^{2n+1}f(x) + (-1)^n\pi^{-1}f^{(2n+2)}(x)\right)\sin\pi x\,dx. \]


Let us take $f(x) = \beta_n(x)$, as before. Recall that $\beta_n(0) = \beta_n(1) = 0$ and
that $\beta_n^{(h)}(0)$ and $\beta_n^{(h)}(1)$ are integers for all $h \in \mathbf{N}$. Thus $q^{n+1}g(0) = q^{n+1}u(0)$
and $q^{n+1}g(1) = -q^{n+1}u(1)$ are both integers. Since $\beta_n^{(2n+2)}(x) = 0$ for all $x$,

\[ g(0) - g(1) = \int_0^1 \pi^{2n+1}\beta_n(x)\sin\pi x\,dx, \]

so that

\[ q^{n+1}g(0) - q^{n+1}g(1) = \frac{p^{n+1}}{\pi}\int_0^1 \beta_n(x)\sin\pi x\,dx \]

is an integer. But

\[ 0 < \frac{p^{n+1}}{\pi}\int_0^1 \beta_n(x)\sin\pi x\,dx < \frac{p^{n+1}}{\pi\,n!}, \]

and $p^{n+1}/\pi\,n! \to 0$ as $n \to \infty$, so that

\[ 0 < \frac{p^{n+1}}{\pi}\int_0^1 \beta_n(x)\sin\pi x\,dx < 1 \]

for large enough $n$, giving the required contradiction.
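The identity $g(0) - g(1) = \int_0^1 \pi^{2n+1}\beta_n(x)\sin\pi x\,dx$, with $g(x) = u(x)\cos\pi x$, does not depend on the assumption that $\pi^2$ is rational, and so it can be tested numerically. The sketch below is an illustration for small $n$: the derivatives of $\beta_n$ are computed from its binomial expansion, and the integral by the midpoint rule (a choice made here). All function names are hypothetical.

```python
import math


def beta_derivative(n, h, x):
    """h-th derivative at x of beta_n(x) = x^n (1 - x)^n / n!, from its expansion."""
    total = 0.0
    for j in range(n + 1):
        power = n + j
        if power >= h:
            coeff = (-1) ** j * math.comb(n, j) / math.factorial(n)
            # d^h/dx^h of x^power is power!/(power - h)! * x^(power - h)
            total += coeff * math.perm(power, h) * x ** (power - h)
    return total


def u(n, x):
    """u(x) = pi^(2n) f(x) - pi^(2n-2) f''(x) + ... + (-1)^n f^(2n)(x), f = beta_n."""
    return sum((-1) ** i * math.pi ** (2 * n - 2 * i) * beta_derivative(n, 2 * i, x)
               for i in range(n + 1))


def boundary_side(n):
    """g(0) - g(1), where g(x) = u(x) cos(pi x); this equals u(0) + u(1)."""
    return u(n, 0.0) * math.cos(0.0) - u(n, 1.0) * math.cos(math.pi)


def integral_side(n, steps=20000):
    """Midpoint-rule estimate of int_0^1 pi^(2n+1) beta_n(x) sin(pi x) dx."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += beta_derivative(n, 0, x) * math.sin(math.pi * x)
    return math.pi ** (2 * n + 1) * total * h


for n in range(1, 5):
    print(n, boundary_side(n), integral_side(n))
```

For $n = 1$ both sides equal 4, since $u(x) = \pi^2 x(1-x) + 2$ and $\int_0^1 x(1-x)\sin\pi x\,dx = 4/\pi^3$.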


Appendix A 


Zorn’s lemma and the well-ordering principle 


A.1 Zorn’s lemma 


We show that Zorn’s lemma is a consequence of the axiom of choice. 


Theorem A.1.1 Assume the axiom of choice. Suppose that $(X, \le)$ is
a non-empty partially ordered set with the property that every non-empty
chain (totally ordered subset) of $X$ has an upper bound. Then there exists a
maximal element in $X$.


Proof¹ We need a few more definitions. Suppose that $A$ is a subset of a
partially ordered set $(X, \le)$ and that $x \in X$. $x$ is a strict upper bound for $A$
if $a < x$ for all $a \in A$. A totally ordered set $(S, \le)$ is well-ordered if every
non-empty subset of $S$ has a least element. A subset $D$ of a totally ordered
set $(S, \le)$ is an initial segment of $S$ if whenever $x \in S$, $d \in D$ and $x \le d$ then
$x \in D$.

We break the proof into a sequence of lemmas and corollaries. If $A$ is a
subset of $X$, let $A'$ be the set of strict upper bounds of $A$. If $A = \emptyset$, then
$A' = X$.

Let $\mathcal{C}$ be the set of chains in $X$.


Lemma A.1.2 Suppose that there exists $C \in \mathcal{C}$ for which $C' = \emptyset$. Then $C$
has a unique upper bound, which is a maximal element of $X$.

Proof $C$ has an upper bound $c$. Then $c \in C$, and is the unique upper bound
for $C$, since $C$ is a chain. If $x \ge c$, then $x$ is an upper bound for $C$, and so is
equal to $c$. Thus $c$ is a maximal element of $X$.


We must therefore find a chain $C$ for which $C' = \emptyset$. Let $s$ be a choice
function on $P(X) \setminus \{\emptyset\}$: if $A$ is a non-empty subset of $X$ then $s(A) \in A$. We
use the choice function to define a successor function, and use this to find a
large chain in $\mathcal{C}$.

¹ Thanks to Peter Johnstone for showing me how to simplify the proof.

Let $T$ be the set of chains $C$ in $X$ with the property that if $D$ is an initial
segment of $C$ and $D \ne C$ then $s(D')$ is the least element of $C \setminus D$.


Lemma A.1.3 $T \ne \emptyset$.

Proof The set $C = \{s(X)\}$ is a non-empty chain in $X$. Suppose that $D$ is
an initial segment in $C$. If $D \ne C$ then $D = \emptyset$, so that $s(D') = s(X)$ is the least element
of $C \setminus D$. Thus $C \in T$.

Lemma A.1.4 If $C \in T$ and $C' \ne \emptyset$ then $C^+ = C \cup \{s(C')\} \in T$.

Proof $C^+$ is certainly a chain, with greatest element $s(C')$. Suppose that $D$
is an initial segment of $C^+$ and that $D \ne C^+$. There are two possibilities.
First, $D$ is a proper subset of $C$. Then $s(D')$ is the least element of $D' \cap C$,
and is therefore the least element of $D' \cap C^+$. Secondly, $D = C$. Then $s(D') = s(C')$ is the least element of $C^+ \setminus D$.


Corollary A.1.5 We order $T$ by inclusion. It is enough to show that $T$
has a greatest element $M$.

Proof For if $M' \ne \emptyset$, then $M^+ \in T$, contradicting the maximality
of $M$.


We show that T is totally ordered by inclusion. 


Lemma A.1.6 Suppose that $C, D \in T$, and that $C$ is not contained in $D$.
Then $D$ is an initial segment of $C$.

Proof Let

\[ E = \{x \in C \cap D : \text{if } y \le x \text{ then } y \in C \text{ if and only if } y \in D\}. \]

Then $E$ is an initial segment of both $C$ and $D$. Since $E \subseteq D$, $E \ne C$, and
$s(E')$ is the least element of $C \setminus E$. Suppose that $E \ne D$. Then $s(E')$ is the
least element of $D \setminus E$. But this implies that $s(E') \in E$, giving a contradiction.
Thus $D = E$, and so $D$ is an initial segment of $C$.


Lemma A.1.7 Let $M = \bigcup_{C \in T} C$. Then $M \in T$.

Proof $M$ is certainly a chain in $X$. Suppose that $D$ is an initial segment in
$M$, and that $D \ne M$. Then $D^+ \subseteq M$, and so $s(D') \in M$. If $x \in M \setminus D$ then
$x \in C \setminus D$ for some $C \in T$, and $s(D')$ is the least element of $C \setminus D$. Thus
$s(D') \le x$, and so $s(D')$ is the least element of $M \setminus D$. Thus $M \in T$.

Corollary A.1.8 $M$ is the greatest element of $T$.

Proof For if $C \in T$, then $C \subseteq M$, by the definition of $M$.


This completes the proof of Theorem A.1.1. 
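For a finite partially ordered set no axiom of choice is needed, but the construction in the proof can still be followed step by step: start with the empty chain and repeatedly adjoin $s(C')$, a chosen strict upper bound of the current chain $C$, until $C' = \emptyset$. The Python sketch below illustrates this in the finite case only. The example order (strict divisibility on $\{1, \dots, 30\}$) and the choice function (`min`) are assumptions made purely for illustration.

```python
def strict_upper_bounds(chain, elements, less_than):
    """The set C' of elements x with c < x for every c in the chain."""
    return [x for x in elements if all(less_than(c, x) for c in chain)]


def grow_chain(elements, less_than, choose):
    """Build a chain by repeatedly adjoining choose(C'), until C' is empty.

    On a finite set this terminates, and the last element added is then a
    maximal element, as in Lemma A.1.2.
    """
    chain = []
    while True:
        candidates = strict_upper_bounds(chain, elements, less_than)
        if not candidates:
            return chain
        chain.append(choose(candidates))


def strictly_divides(a, b):
    """The strict order a < b given by: a divides b and a != b."""
    return a != b and b % a == 0


elements = list(range(1, 31))
chain = grow_chain(elements, strictly_divides, min)
print(chain)  # → [1, 2, 4, 8, 16]
```

With a different choice function a different chain results; for instance `max` gives the one-element chain `[30]`, and 30 is also maximal for divisibility in this set.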


A.2 The well-ordering principle 


Zorn’s lemma implies that any non-empty set can be well-ordered. 


Theorem A.2.1 (The well-ordering principle) If S is a non-empty set, 
Zorn’s lemma implies that there is a total order on S under which S is 
well-ordered. 


Proof We sketch the proof, and leave it to the reader to supply all the details.
Let $T$ be the set of all pairs $(A, \le_A)$, where $A$ is a non-empty subset of $S$
and $\le_A$ is a well-ordered total order on $A$. $T$ is not empty (consider singleton
sets). We define a partial order on $T$ by setting $(A, \le_A) \le (B, \le_B)$ if $A$ is
an initial segment of $B$, and the two partial orders agree on $A$: $x \le_A y$ if
and only if $x \le_B y$. We argue as in Theorem A.1.1. If $C$ is a chain in $T$, set
$M = \bigcup\{A : (A, \le_A) \in C\}$. If $x, y \in M$, there exists $(A, \le_A) \in C$ such that
$x, y \in A$. Set $x \le_M y$ if $x \le_A y$. This is well defined, and defines a total
order on $M$. This total order is a well-ordering of $M$, so that $(M, \le_M) \in T$.
If $(A, \le_A) \in C$ then $A$ is an initial segment of $(M, \le_M)$. Hence $(M, \le_M)$ is
an upper bound for $C$. We can apply Zorn's lemma: there exists a maximal
element $(N, \le_N)$ of $T$. We claim that $N = S$. If not, there exists $z \in S \setminus N$.
Let $N' = N \cup \{z\}$. Define a partial order $\le_{N'}$ on $N'$ by setting $x \le_{N'} y$ if
$x, y \in N$ and $x \le_N y$, and by setting $x \le_{N'} z$ for all $x \in N'$. Then $(N', \le_{N'}) \in T$,
and $(N, \le_N)$ is strictly less than $(N', \le_{N'})$, contradicting the maximality of
$(N, \le_N)$.


There are circumstances in which it may be more convenient to prove 
results using the well-ordering principle, rather than the axiom of choice. 
The well-ordering principle is used to develop the theory of ordinals. 

It is easy to deduce the axiom of choice from the well-ordering 
principle. 


Theorem A.2.2 The well-ordering principle implies the axiom of choice. 


Proof Suppose that $\{B_\alpha\}_{\alpha \in A}$ is a non-empty family of non-empty sets. Let
$X = \bigcup_{\alpha \in A} B_\alpha$, and let $\le$ be a well-ordered total order on $X$. If $\alpha \in A$, let
$c(\alpha)$ be the least element of $B_\alpha$. Then $c$ is a choice function.

Exercises


A.2.1 Give the details of the proof of Theorem A.2.1. 

A.2.2 Use the well-ordering principle to prove Theorem 1.9.2. 

A.2.3 Modify the proof of Theorem A.1.1 in the following way to deduce the 
well-ordering principle directly from the axiom of choice. 

Let $s$ be a choice function on the non-empty subsets of a non-empty
set $X$. Let $T$ be the set of pairs $(C, \le_C)$, where $C$ is a subset of $X$ and
$\le_C$ is a well-ordered total order on $C$ with the property that if $D$ is an
initial segment of $C$ and $C \setminus D$ is not empty then $s(X \setminus D)$ is the least
element of $C \setminus D$.

(i) Suppose that $(C, \le_C)$ and $(D, \le_D)$ are elements of $T$, and that $C$ is
not contained in $D$. Show that $D$ is an initial segment of $C$, and that
the two total orders $\le_C$ and $\le_D$ agree on $D$.

(ii) Let $M = \bigcup\{C : (C, \le_C) \in T\}$. Define a total order $\le_M$ on $M$,
verifying that it is well-defined, and show that $(M, \le_M) \in T$.

(iii) Show that M = X. 